THERMOSTABLE POLYMERASES HAVING ALTERED FIDELITY

This application claims the benefit of priority 
of United States Provisional Application serial No. 
60/031,496, filed November 27, 1996, the entire contents
of which are incorporated herein by reference.

This invention was made with government support 
under grant number OIG-R35-CA-39903 awarded by the 
National Institutes of Health and grant number BIR9214821 
awarded by the National Science Foundation. The
government has certain rights in the invention. 



BACKGROUND OF THE INVENTION 

The present invention relates generally to 
thermostable polymerases and more specifically to methods 
for identifying polymerase mutants having desired
fidelity. 

Every living organism requires genetic
material, deoxyribonucleic acid (DNA), to pass a unique
collection of characteristics to its offspring. Genes
are discrete segments of the DNA and provide the
information required to generate a new organism. Even
simple organisms, such as bacteria, contain thousands of
genes, and the number is manyfold greater in complex
organisms such as humans. Understanding the complexities
of the development and functioning of living organisms
requires knowledge of these genes. However, the amount
of DNA that can be isolated for study has often been
limiting.
A major breakthrough in the study of genes was
the development of the polymerase chain reaction (PCR).
PCR amplifies genes or portions of genes by making many
identical copies, allowing isolation of genes from very
tiny amounts of DNA. The motors for PCR are DNA
polymerases that copy the DNA of each gene during each
round of DNA synthesis. Using oligonucleotides that
determine the start and termination of DNA synthesis, a
single gene can be replicated into millions of copies.

This process has created a revolution in biotechnology
and has been used extensively for the identification of
mutant genes that are responsible for or associated with
inherited human diseases. It is now possible to identify
a mutant gene in a single cell, amplify the gene a
million times, and establish the nature of the mutation.
One application of identifying a mutant gene is the
determination of genetic susceptibility to disease, which
can be mapped by gene amplification and DNA sequencing.

DNA polymerases function in cells as the
enzymes responsible for the synthesis of DNA. They
polymerize deoxyribonucleoside triphosphates in the
presence of a metal activator, such as Mg2+, in an order
dictated by the DNA template or polynucleotide template
that is copied. Even though the template dictates the
order of nucleotide subunits that are linked together in
the newly synthesized DNA, these enzymes also function to
maintain the accuracy of this process. The contribution
of DNA polymerases to the fidelity of DNA synthesis is
mediated by two mechanisms. First, the geometry of the
substrate binding site in DNA polymerases contributes to
the selection of the complementary deoxynucleoside
triphosphates. Mutations within the substrate binding
site on the polymerase can alter the fidelity of DNA
synthesis. Second, many DNA polymerases contain a
proof-reading 3'-5' exonuclease that preferentially and
immediately excises non-complementary deoxynucleoside
triphosphates if they are added during the course of
synthesis. As a result, these enzymes copy DNA in vitro
with a fidelity varying from 5 x 10^-4 (1 error per 2000
bases) to 10^-7 (1 error per 10^7 bases) (Fry and Loeb,
Animal Cell DNA Polymerases, pp. 221, CRC Press, Inc.,
Boca Raton, FL (1986); Kunkel, T.A., J. Biol. Chem.
267:18251-18254 (1992)).
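The relationship between an error rate and the average number of bases copied per error can be illustrated with a short calculation. This is only a sketch: the function names are hypothetical, and the model assumes errors occur independently at a constant per-base rate, which is a simplification.

```python
def bases_per_error(error_rate: float) -> float:
    """Average number of bases copied per misincorporation."""
    if error_rate <= 0:
        raise ValueError("error_rate must be positive")
    return 1.0 / error_rate


def expected_errors(error_rate: float, length: int, copies: int = 1) -> float:
    """Expected number of errors when copying `length` bases `copies` times,
    assuming independent errors at a constant per-base rate."""
    return error_rate * length * copies


# The two ends of the fidelity range quoted in the text.
least_accurate = bases_per_error(5e-4)   # about 2000 bases per error
most_accurate = bases_per_error(1e-7)    # about 10^7 bases per error
```

Under the same assumption, copying a 1000-base template once at an error rate of 5 x 10^-4 would give an expected 0.5 errors per copy.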



In vivo, DNA polymerases participate in a
spectrum of DNA synthetic processes including DNA
replication, DNA repair, recombination, and gene
amplification (Kornberg and Baker, DNA Replication, pp.
929, W.H. Freeman and Co., New York (1992)). During each
DNA synthetic process, the DNA template is copied once or
at most a few times to produce identical replicas. In
vitro DNA replication, in contrast, can be repeated many
times, for example, during PCR.

In the initial studies with PCR, the DNA
polymerase was added at the start of each round of DNA
replication. Subsequently, it was determined that
thermostable DNA polymerases could be obtained from
bacteria that grow at elevated temperatures, and these
enzymes need to be added only once. At the elevated
temperatures used during PCR, these enzymes would not
denature. As a result, one can carry out repetitive
cycles of polymerase chain reactions without adding fresh
enzymes at the start of each synthetic addition process.
The commercial market for the sale of DNA polymerases
from thermostable organisms can be conservatively
estimated at 200 million dollars per year. DNA
polymerases, particularly thermostable polymerases, are
the key to a large number of techniques in recombinant
DNA studies and in medical diagnosis of disease.



Due to the importance of DNA polymerases in 
biotechnology and medicine, it would be highly 
advantageous to generate DNA polymerases having desired 
enzymatic properties such as altered fidelity. However, 
the ability to predict the effect of introducing an amino 
acid mutation into the sequence of a protein remains very 
limited. Even when structural information is available 
for the protein of interest, it is often very difficult 
to predict the effect of mutations of specific amino acid 
residues on the function of that protein. In particular, 
it is extremely difficult to predict amino acid
substitutions that will alter the activity of an enzyme 
to achieve a desirable change. 

Despite the limitations in predicting the
effect of introducing amino acid substitutions into
proteins, a number of mutant DNA polymerases have been
discovered, or have been created by site-specific
mutagenesis, and have been used in PCR amplification
(Tabor and Richardson, Proc. Natl. Acad. Sci. USA
92:6339-6343 (1995)). Some of these mutant polymerases
offer particular advantages with respect to
thermostability, processivity, length of the newly
synthesized DNA product, or fidelity of DNA synthesis.
Those that are more accurate for the most part contain a
3'-5' exonuclease activity that removes misincorporated
bases prior to adding the next nucleotide during DNA
synthesis. However, the current spectrum of mutant DNA
polymerases is quite limited. For the most part, these
mutants have been obtained by introducing a single base
substitution at a specified site, purifying the enzyme
and studying the changes in catalytic activity (Joyce and
Steitz, Annu. Rev. Biochem. 63:777-822 (1994)). These
laborious and step-wise procedures have been necessary
due to the lack of adequate knowledge to predict the
effects of most single amino acid substitutions and due
to the lack of rules for predicting the effects of
multiple simultaneous substitutions.

Thus, there exists a need for rapid and 
efficient methods to produce and screen for modified 
polymerases having desired fidelity in polynucleotide 
synthesis. The present invention satisfies this need and
provides related advantages as well. 

SUMMARY OF THE INVENTION

The present invention provides a method for
identifying a thermostable polymerase having altered
fidelity. The method consists of generating a random
population of polymerase mutants by mutating at least one
amino acid residue of a thermostable polymerase and
screening the population for one or more active
polymerase mutants by genetic selection. For example,
the invention provides a method for identifying a
thermostable polymerase having altered fidelity by
mutating at least one amino acid residue in an active
site O-helix of a thermostable polymerase. The invention
also provides thermostable polymerases and nucleic acids
encoding thermostable polymerases having altered
fidelity, for example, high fidelity polymerases and low
fidelity polymerases. The invention additionally
provides a method for identifying one or more mutations
in a gene by amplifying the gene with a high fidelity
polymerase. The invention further provides a method for
accurately copying repetitive nucleotide sequences using
a high fidelity polymerase mutant. The invention also
provides a method for diagnosing a genetic disease using
a high fidelity polymerase mutant. The invention further
provides a method for randomly mutagenizing a gene by
amplifying the gene using a low fidelity polymerase
mutant.



BRIEF DESCRIPTION OF THE DRAWINGS 



Figure 1 shows the nucleotide and amino acid 
sequence of Taq DNA polymerase I (SEQ ID NOS:1 and 2,
respectively).

Figure 2 shows a compilation of amino acid
substitutions identified in a screen of Taq DNA
polymerase I mutants. Panel A shows single mutations,
which were identified in the screen of a 9% library,
listed under the wild type amino acids. Panel B shows
the sequence of multiply substituted mutants identified
in the screen of a 9% library. Panel C shows mutations
selected from a totally random library of selected amino
acids.



Figure 3 shows the spectrum of single base 
changes generated in a forward mutation assay by Taq DNA
polymerase I mutant Thr664Arg. 

DETAILED DESCRIPTION OF THE INVENTION

The invention is directed to methods for
screening and identifying thermostable polymerases that
have altered fidelity of DNA synthesis as well as to the
resultant polymerase compositions. As disclosed herein,
the invention provides rapid and efficient methods to
identify polymerase mutants having altered fidelity.
These methods are applicable to the identification of
polymerase mutants having a desired activity such as high
fidelity or low fidelity. An advantage of the methods is
that they use a population of polymerase mutants to
rapidly identify active polymerase mutants having altered
fidelity. The identification of low fidelity mutants is
useful for introducing mutations into specific genes due
to the increased frequency of misincorporation of
nucleotides during error-prone PCR amplification. The
identification of high fidelity mutants is useful for PCR
amplification of genes and for mapping of genetic
mutations. The methods of the invention can therefore be
advantageously applied to the identification of
polymerase mutants useful for the characterization of
specific genes and for the identification and diagnosis
of human genetic diseases.

As used herein, the term "polymerase" is
intended to refer to an enzyme that polymerizes
nucleoside triphosphates. Polymerases use a template
nucleic acid strand to synthesize a complementary nucleic
acid strand. The template strand and synthesized nucleic
acid strand can independently be either DNA or RNA.
Polymerases can include, for example, DNA polymerases
such as Escherichia coli DNA polymerase I and Thermus
aquaticus (Taq) DNA polymerase I, DNA-dependent RNA
polymerases and reverse transcriptases. The polymerase
is a polypeptide or protein containing sufficient amino
acids to carry out a desired enzymatic function of the
polymerase. The polymerase need not contain all of the
amino acids found in the native enzyme but only those
which are sufficient to allow the polymerase to carry out
a desired catalytic activity. Catalytic activities
include, for example, 5'-3' polymerization, 5'-3'
exonuclease and 3'-5' exonuclease activities.
As used herein, the term "polymerase mutant" is
intended to refer to a polymerase that contains one or
more amino acids that differ from a selected polymerase.
The selected polymerase is determined based on desired
enzymatic properties and is used as a parent polymerase
to generate a population of polymerase mutants. A
selected polymerase can be, for example, a wild type
polymerase as isolated from an organism or can be a
mutant polymerase that differs from a wild type
polymerase by one or more amino acids and has desirable
enzymatic properties. As disclosed herein, a
thermostable polymerase such as Taq DNA polymerase I can
be selected, for example, as a polymerase to generate a
population of polymerase mutants.



As used herein, the term "population" is
intended to refer to a group of two or more different
molecular species. Molecular species differ by some
detectable property such as a difference in at least one
amino acid residue or at least one nucleotide residue or
a difference introduced by the modification of an amino
acid such as the addition of a chemical functional group.
For example, a population of polymerase mutants would
contain two or more different polymerase mutants.
Typically, populations can be as small as two species and
as large as 10^12 species. In some embodiments,
populations are between about five and 20 different
species as well as up to hundreds or thousands of
different species. In other embodiments, populations can
be, for example, greater than 10^4, 10^5 and 10^6 different
species. In the specific example presented in Example I,
the population described therein is 50,000 different
species. In yet other embodiments, populations are
between about 10^6-10^8 or more different species. Those
skilled in the art will know a suitable size and
diversity of a population sufficient for a particular
application.

A population of polymerase mutants consists of
two or more mutant polymerases which differ by at least
one amino acid from the parent polymerase. A population
of polymerase mutants can consist, for example, of
multiple substitutions of a single amino acid residue
where the substitutions are changes to any or all of the
non-parental, naturally occurring amino acids at that
amino acid position. In this example, a population
containing all of the substitutions would comprise
nineteen members, each member having a different one of
the nineteen non-parental amino acids at a single amino
acid position. A population of polymerase mutants can also
consist, for example, of at least one substitution at two
or more different amino acid positions. In this example,
a minimal population containing two polymerase mutants
would consist of a single amino acid substitution at each
of two different positions. Such a population can be expanded
with the addition of substitutions to any or all of the
19 non-parental amino acids at these two amino acid
positions or additional amino acid positions.
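The nineteen-member, single-position population described above can be enumerated with a short sketch. The parent sequence used in the usage line is a toy string chosen for illustration, not a polymerase sequence, and the function names are hypothetical.

```python
# The twenty naturally occurring amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_site_population(parent: str, position: int) -> list[str]:
    """Return every variant of `parent` carrying one of the nineteen
    non-parental amino acids at the 0-based `position`."""
    wild_type = parent[position]
    if wild_type not in AMINO_ACIDS:
        raise ValueError(f"unrecognized amino acid: {wild_type!r}")
    return [
        parent[:position] + aa + parent[position + 1:]
        for aa in AMINO_ACIDS
        if aa != wild_type
    ]


def full_population_size(n_positions: int) -> int:
    """Number of distinct non-parental variants when `n_positions`
    positions are each allowed to take any of the twenty amino acids."""
    return 20 ** n_positions - 1


# Toy parent sequence (an assumption for illustration only).
population = single_site_population("MKTAY", 2)
```

For two fully randomized positions, `full_population_size(2)` counts 399 non-parental variants, covering both single and double substitutions.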

As used herein, the term "random" when used in
reference to a population is intended to refer to a
population of molecules generated without limiting the
molecules to contain predetermined specific residues.
Such a population excludes molecules in which a specific
residue is substituted with a specific predetermined
residue and individually assayed to determine its
activity. The residues can be amino acid residues or
nucleotide residues encoding a codon. The random
molecules can be generated, for example, by introducing
random nucleotides into an oligonucleotide sequence that
encodes an amino acid sequence of a protein region of
interest (see Example I). Thus, a random population is
generated to contain random oligonucleotide sequences
which can be expressed in appropriate cells to generate a
random population of expressed proteins. A specific
example of such a random population is the population of
polymerase mutants described in Example I that were
generated to screen for active polymerase mutants having
altered fidelity.
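One way to picture the introduction of random nucleotides into an oligonucleotide is the in silico sketch below. It is not the procedure of Example I: the per-nucleotide substitution rate, the toy wild type sequence and the function names are all assumptions made for illustration.

```python
import random

BASES = "ACGT"


def randomize_oligo(wild_type: str, rate: float, rng: random.Random) -> str:
    """Copy `wild_type`, replacing each base with one of the three other
    bases with probability `rate` (an assumed per-nucleotide rate)."""
    out = []
    for base in wild_type:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def make_library(wild_type: str, rate: float, size: int, seed: int = 0) -> list[str]:
    """Generate `size` randomized oligonucleotides; the seed makes the
    simulated library reproducible."""
    rng = random.Random(seed)
    return [randomize_oligo(wild_type, rate, rng) for _ in range(size)]


# Toy 12-nucleotide target (hypothetical, not from the specification).
library = make_library("ATGGCCAAGTTT", rate=0.09, size=100, seed=1)
```

No residue is predetermined in such a library; which positions change, and to what, is left to chance and sorted out afterwards by selection.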

As used herein, the term "catalytic activity"
or "activity" when used in reference to a polymerase is
intended to refer to the enzymatic properties of the
polymerase. The catalytic activity includes, for
example: enzymatic properties such as the rate of
synthesis of nucleic acid polymers; the Km for substrates
such as nucleoside triphosphates and template strand; the
fidelity of template-directed incorporation of
nucleotides, where the frequency of incorporation of
non-complementary nucleotides is compared to that of
complementary nucleotides; processivity, the number of
nucleotides synthesized by a polymerase prior to
dissociation from the DNA template; discrimination of the
ribose sugar; and stability, for example, at elevated
temperatures. Polymerases can discriminate between
templates, for example, DNA polymerases generally use DNA
templates and RNA polymerases generally use RNA
templates, whereas reverse transcriptases use both RNA
and DNA templates. DNA polymerases also discriminate
between deoxyribonucleoside triphosphates and
dideoxyribonucleoside triphosphates. Any of these
distinct enzymatic properties can be included in the
meaning of the term catalytic activity, including any
single property, any combination of properties or all of
the properties. Although specific embodiments
identifying polymerase mutants having altered fidelity
are exemplified herein, the methods of the invention can
similarly be applied to identify polymerases having
altered catalytic activity distinct from altered
fidelity.



As used herein, the term "fidelity" when used
in reference to a polymerase is intended to refer to the
accuracy of template-directed incorporation of
complementary bases in a synthesized DNA strand relative
to the template strand. Fidelity is measured based on
the frequency of incorporation of incorrect bases in the
newly synthesized nucleic acid strand. The incorporation
of incorrect bases can result in point mutations,
insertions or deletions. Fidelity can be calculated
according to the procedures described in Tindall and
Kunkel (Biochemistry 27:6008-6013 (1988)). Methods for
determining fidelity are well known in the art and
include, for example, those described in Example III. A
polymerase or polymerase mutant can exhibit either high
fidelity or low fidelity. As used herein, the term "high
fidelity" is intended to mean a frequency of accurate
base incorporation that exceeds a predetermined value.
Similarly, the term "low fidelity" is intended to mean a
frequency of accurate base incorporation that is lower
than a predetermined value. The predetermined value can
be, for example, a desired frequency of accurate base
incorporation or the fidelity of a known polymerase.



As used herein, the term "altered fidelity"
refers to the fidelity of a polymerase mutant that
differs from the fidelity of the selected parent
polymerase from which the polymerase mutant is derived.
The altered fidelity can either be higher or lower than
the fidelity of the selected parent polymerase. Thus,
polymerase mutants with altered fidelity can be
classified as high fidelity polymerases or low fidelity
polymerases. Altered fidelity can be determined by
assaying the parent and mutant polymerase and comparing
their activities using any assay that measures the
accuracy of template directed incorporation of
complementary bases. Such methods for measuring fidelity
include, for example, those described in Example III as
well as other methods known to those skilled in the art.
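The comparison of a mutant with its parent can be expressed as a small classification sketch. It assumes that each polymerase has been reduced to a single measured error frequency (incorrect bases per base incorporated), with the parent value serving as the predetermined value; the function name, the `tolerance` parameter and the numbers in the usage lines are hypothetical and are not results from the specification.

```python
def classify_fidelity(mutant_error_rate: float,
                      parent_error_rate: float,
                      tolerance: float = 0.0) -> str:
    """Classify a mutant relative to its parent polymerase.

    A lower error frequency means a higher frequency of accurate base
    incorporation. `tolerance` is an assumed fractional margin within
    which the two measurements are treated as indistinguishable.
    """
    if mutant_error_rate < parent_error_rate * (1 - tolerance):
        return "high fidelity"
    if mutant_error_rate > parent_error_rate * (1 + tolerance):
        return "low fidelity"
    return "unaltered"


# Hypothetical error frequencies, for illustration only.
more_accurate = classify_fidelity(1e-5, parent_error_rate=1e-4)
less_accurate = classify_fidelity(1e-3, parent_error_rate=1e-4)
```

In practice the comparison depends on the assay used and its measurement error, which the `tolerance` margin only crudely stands in for.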

As used herein, the term "immutable" when used
in reference to an amino acid residue is intended to
refer to an amino acid residue which cannot be
substituted with another amino acid residue and still
retain measurable function of the polypeptide. An
immutable amino acid residue can be determined by
introducing one or more substitutions of an amino acid
residue and assaying the resulting mutant polypeptides
for polypeptide function. An immutable residue can be
identified, for example, using site-directed mutagenesis
to substitute each of the 19 non-parental amino acids at
a given position and determining if any of these mutants
are active. Random mutagenesis can also be employed to
introduce substitutions of each of the nineteen
naturally occurring non-parental amino acids at a given
position. Random mutagenesis can provide a statistical
representation of all 20 amino acids at a given position.
Sequencing of polymerase mutants allows determination of
whether a given amino acid residue can tolerate any
mutations. Assays for determining the function of mutant
polypeptides include in vitro enzymatic assays as well as
genetic complementation assays such as those described in
Example I. If substitution of an amino acid residue with
any other amino acid results in loss of polypeptide
function, then that amino acid residue is considered to
be immutable.
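The bookkeeping behind this determination can be sketched as follows: given the sequences of mutants that retained function, tabulate which substitutions were tolerated at each position and flag positions where none were seen. The sequences in the usage lines are toy strings, not polymerase data, and the function names are hypothetical. A position flagged this way is only a candidate, since a finite library may simply not have sampled a tolerated substitution.

```python
def tolerated_substitutions(wild_type: str,
                            active_mutants: list[str]) -> dict[int, set[str]]:
    """Map each 0-based position to the set of non-wild-type residues
    observed at that position among active mutants."""
    tolerated: dict[int, set[str]] = {i: set() for i in range(len(wild_type))}
    for mutant in active_mutants:
        if len(mutant) != len(wild_type):
            raise ValueError("mutant and wild type lengths differ")
        for i, (wt_res, mut_res) in enumerate(zip(wild_type, mutant)):
            if mut_res != wt_res:
                tolerated[i].add(mut_res)
    return tolerated


def candidate_immutable_positions(wild_type: str,
                                  active_mutants: list[str]) -> list[int]:
    """Positions at which no substitution was found in any active mutant."""
    tolerated = tolerated_substitutions(wild_type, active_mutants)
    return [i for i, subs in tolerated.items() if not subs]


# Toy sequences (assumptions for illustration only).
wild_type = "MKTAY"
active = ["MKSAY", "MRTAY", "MKTGY"]
candidates = candidate_immutable_positions(wild_type, active)
```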

As used herein, the term "nearly immutable"
when used in reference to an amino acid residue is
intended to refer to an amino acid residue which can only
tolerate conservative substitutions and still retain
polypeptide function. Conservative amino acids are known
to those skilled in the art and include those amino acids
which have similar structure and chemical properties.
Conservative substitutions of amino acids can be
identified, for example, based on the frequencies of
amino acid changes between corresponding proteins of
homologous organisms (Schulz and Schirmer, Principles of
Protein Structure, Springer-Verlag, New York (1979)).

As used herein, the term "substantially" or
"substantially the same" when used in reference to a
nucleotide or amino acid sequence is intended to mean
that the function of the polypeptide encoded by the
nucleotide or amino acid sequence is essentially the same
as the referenced parental nucleotide or amino acid
sequence. For example, changes in a nucleotide or amino
acid sequence that result in substitution of amino acids
that differ from the parent molecule but that do not
alter the desired activity of the encoded polypeptide
would result in substantially the same sequence. A
nucleotide or amino acid sequence is substantially the
same if the difference in that sequence from the
reference parental sequence does not result in any
measurable difference in the desired activity of the
encoded polypeptide.
The invention provides a method for identifying
a thermostable polymerase having altered fidelity. The
method consists of generating a random population of
polymerase mutants by mutating at least one amino acid
residue of a thermostable polymerase and screening the
population for one or more active polymerase mutants by
genetic selection.



The generation and identification of
polymerases having altered fidelity or altered catalytic
activity is accomplished by first creating a population
of mutant polymerases through random sequence mutagenesis
of regions within the polymerase that can influence the
fidelity of polymerization (Loeb, L.A., Adv. Pharmacol.
35:321-347 (1996)). The identification of active mutants
is performed in vivo and is based on genetic
complementation of conditional polymerase mutants under
non-permissive conditions. Once identified, the active
polymerases are then screened for fidelity of
polynucleotide synthesis.
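The order of operations described above (select for activity first, then screen the survivors for fidelity) can be summarized in a schematic sketch. The two callables stand in for laboratory steps: `is_active` for the outcome of genetic complementation under non-permissive conditions and `error_rate` for a fidelity assay. Both, along with the toy data in the usage lines, are hypothetical placeholders and not part of the specification.

```python
from typing import Callable, Iterable


def identify_altered_fidelity(
    library: Iterable[str],
    is_active: Callable[[str], bool],
    error_rate: Callable[[str], float],
    parent_error_rate: float,
) -> dict[str, list[str]]:
    """Two-stage workflow: keep only active mutants, then sort them by
    whether their measured error rate is below or above the parent's."""
    active = [m for m in library if is_active(m)]
    rates = {m: error_rate(m) for m in active}  # assay only the survivors
    return {
        "active": active,
        "high_fidelity": [m for m in active if rates[m] < parent_error_rate],
        "low_fidelity": [m for m in active if rates[m] > parent_error_rate],
    }


# Toy stand-ins for laboratory measurements (illustrative values only).
toy_rates = {"mutantA": 1e-5, "mutantB": 1e-3, "mutantC": 1e-4}
result = identify_altered_fidelity(
    library=["mutantA", "mutantB", "mutantC", "inactive1"],
    is_active=lambda m: m in toy_rates,
    error_rate=lambda m: toy_rates[m],
    parent_error_rate=1e-4,
)
```

The design point the sketch makes is that inactive mutants never reach the fidelity assay, so assay effort is spent only on polymerases that already function.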

The methods of the invention employ a
population of polymerase mutants and the screening of the
polymerase mutant population to identify an active
polymerase mutant. Using a population of polymerase
mutants is advantageous in that a number of amino acid
substitutions including single amino acid and multiple
amino acid substitutions can be examined for their effect
on polymerase fidelity. The use of a population of
polymerase mutants increases the probability of
identifying a polymerase mutant having a desired
fidelity.

Screening a population of polymerase mutants
has the additional advantage of alleviating the need to
make predictions about the effect of specific amino acid
substitutions on the activity of the polymerase. The
substitution of single amino acids has limited
predictability as to its effect on enzymatic activity and
the effect of multiple amino acid substitutions is
virtually unpredictable. The methods of the invention
allow for screening a large number of polymerase mutants
which can include single amino acid substitutions and
multiple amino acid substitutions. In addition, using
screening methods that select for active polymerase
mutants has the additional advantage of eliminating
inactive mutants that could complicate screening
procedures that require purification of polymerase
mutants to determine activity.

Moreover, the methods of the invention allow
for targeting of amino acid residues adjacent to
immutable or nearly immutable amino acid residues.
Immutable or nearly immutable amino acid residues are
residues required for activity, and those immutable
residues located in the active site provide critical
residues for polymerase activity. Mutating amino acid
residues adjacent to these required residues provides the
greatest likelihood of modulating the activity of the
polymerase. Introducing random mutations at these sites
increases the probability of identifying a mutant
polymerase having a desired alteration in activity such
as altered fidelity.

A polymerase is selected as a parent polymerase
to introduce mutations for generating a library of
mutants. Polymerases obtained from thermophilic organisms
such as Thermus aquaticus have particularly desirable
enzymatic characteristics due to their stability and
activity at high temperatures. Thermostable polymerases
are stable and retain activity at temperatures greater
than about 37°C, generally greater than about 50°C, and
particularly greater than about 90°C. The use of the
thermostable polymerase Taq DNA polymerase I as a parent
polymerase to generate polymerase mutants is disclosed
herein (see Example I).

Although a specific embodiment using Taq DNA
polymerase I is disclosed in the examples, the methods of
the invention can similarly be applied to
thermostable polymerases other than Thermus aquaticus DNA
polymerases. Such other polymerases include, for
example, RNA polymerases from Thermus aquaticus and RNA
and DNA polymerases from other thermostable bacteria.
Using the guidance provided herein in reference to DNA
polymerases, those skilled in the art can apply the
teachings of the invention to the generation and
identification of these other polymerases having altered
fidelity of polynucleotide synthesis.

In addition to creating mutant DNA polymerases
from organisms that grow at elevated temperatures, the
methods of the invention can similarly be applied to
non-thermostable polymerases provided that there is a
selection or screen such as the genetic complementation
of a conditional polymerase mutation as described herein
(see Example I). Such a selection or screen of a
non-thermostable polymerase can be, for example, the
inducible or repressible expression of an endogenous
polymerase. Polymerases having altered fidelity can
similarly be generated and selected from both prokaryotic
and eukaryotic cells as well as viruses. Those skilled
in the art will know how to apply the teachings described
herein to the generation of polymerases having altered
fidelity from such other organisms and such other cell
types.

Thus, the invention provides a general method
for the production of a polymerase that has an altered
fidelity in DNA or RNA synthesis. The method consists of
producing a population of sufficient size and diversity
so as to contain at least one polymerase molecule having
an altered fidelity and then screening that population to
identify the polymerase having altered fidelity. The
altered polymerase fidelity can be either an increase or
decrease in the accuracy of DNA synthesis.

In one embodiment, the invention involves the
production of a relatively large population of randomly
mutagenized nucleic acids encoding a polymerase and
introduction of the population into host cells to produce
a library. The mutagenized polymerase encoding nucleic
acids are expressed, and the library is screened for
active polymerase mutants by complementation of a
temperature sensitive mutation of an endogenous
polymerase. Colonies which are viable at the
non-permissive temperature are those which have
polymerase encoding nucleic acids which code for active
mutants.

To generate a random population of polymerase
mutants, a random sequence of nucleotides is substituted
for a defined target sequence of a plasmid-encoded gene
that specifies a biologically active molecule. In one
application of this procedure, a double-stranded
oligodeoxyribonucleotide is provided by hybridizing two
partially complementary oligonucleotides, one or both of
which contain random sequences at specified positions.
The partially double-stranded oligonucleotide is filled
in by DNA polymerase, cut at restriction sites and
ligated into a DNA vector. The plasmid encodes the gene
for a thermostable DNA polymerase, and the
oligonucleotide is inserted in place of a portion of the
gene that modulates the fidelity of DNA synthesis. After
ligation, the reconstructed plasmids constitute a library
of different nucleic acid sequences encoding the
thermostable DNA polymerase and polymerase mutants.

As disclosed herein, a genetic screen can be
used to identify active polymerase mutants having altered
fidelity. The library of nucleic acid sequences encoding
polymerase and polymerase mutants is transfected into a
bacterial strain such as E. coli strain recA718 polA12,
which contains a temperature sensitive mutation in DNA
polymerase. Exogenous DNA polymerases have been shown to
functionally substitute for E. coli DNA polymerase I
using E. coli strain recA718 polA12 and to complement the
observed growth defect at elevated temperature,
presumably caused by the instability of the endogenous
DNA polymerase I at elevated temperatures (Sweasy and
Loeb, J. Biol. Chem. 267:1407-1410 (1992); Kim and Loeb,
Proc. Natl. Acad. Sci. USA 92:684-688 (1995)). It was
unknown, however, whether a thermostable polymerase could
substitute for E. coli DNA polymerase given the distinct
and harsh environment experienced by thermophilic
organisms in which enzymes must function at extremely
high temperatures. As disclosed herein, wild type Taq
DNA polymerase I was found to complement the growth
defect of E. coli strain recA718 polA12 (see Example I).
Using such a complementation system, various Taq
DNA polymerase I mutants were identified in host bacteria
that harbor plasmids encoding active thermoresistant DNA
polymerases that allowed bacterial growth and colony
formation at elevated (restrictive) temperatures (see
Examples I and II).



The invention also provides a method for
identifying a thermostable polymerase having altered
fidelity. The method consists of generating a random
population of polymerase mutants by mutating at least one
amino acid residue in an active site O-helix of a
thermostable polymerase, and screening the population for
one or more active polymerase mutants.

The invention additionally provides a method
for identifying a thermostable polymerase having altered
catalytic activity. The method consists of generating a
random population of polymerase mutants by mutating at
least one amino acid residue of a thermostable polymerase
and screening the population for one or more active
polymerase mutants.

A random population of polymerase mutants is
generated by mutating one or more amino acid residues in
an active site O-helix target sequence of a thermostable
polymerase. The O-helix has been postulated to interact
with the substrate template complex (Joyce and Steitz,
supra, (1994)). The O-helix has been observed in the
crystal structure of E. coli DNA polymerase I Klenow
fragment and Taq DNA polymerase (Beese et al., Science
260:352-355 (1993); Kim et al., Nature 376:612-616
(1995)). As disclosed in Example II, random sequences
were substituted for nucleotides encoding amino acids
Arg659 through Tyr671 of the O-helix of Taq DNA
polymerase I to generate a random population of
polymerase mutants.
Using a genetic complementation screen, a
variety of active Taq DNA polymerase I mutants were
identified (see Example II). Several amino acid residues
were found to be immutable or nearly immutable based on
the complementation assay. These immutable or nearly
immutable amino acid residues in the O-helix are Arg659,
Lys663, Phe667 and Tyr671. As used herein, a wild type
amino acid is designated as a residue preceding the
number of the amino acid position. A mutated amino acid
is designated as a residue following the number of the
amino acid position. These immutable or nearly immutable
sites cannot be altered without loss of the
function of the DNA polymerase. Due to their position in
the active site O-helix of Taq DNA polymerase I, these
immutable or nearly immutable residues are critical
residues that are required for the activity of the
polymerase.
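The naming convention described above (wild type residue, position, mutated residue, as in Asn666Asp) can be illustrated with a short parser. This is only a sketch: the names `Substitution` and `parse_substitution` are hypothetical and are not part of the disclosure.

```python
import re
from typing import NamedTuple

# Standard three-letter amino acid codes.
AMINO_ACIDS = {
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
}


class Substitution(NamedTuple):
    wild_type: str  # residue written before the position number
    position: int   # amino acid position in the polymerase
    mutant: str     # residue written after the position number


_PATTERN = re.compile(r"^([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def parse_substitution(name: str) -> Substitution:
    """Split a name such as 'Asn666Asp' into (wild type, position, mutant)."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a substitution name: {name!r}")
    wild_type, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    if wild_type not in AMINO_ACIDS or mutant not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid code in {name!r}")
    return Substitution(wild_type, position, mutant)


if __name__ == "__main__":
    print(parse_substitution("Asn666Asp"))
```

A strict pattern of this kind also rejects names in which a digit has been mistyped as a letter, which is a common transcription error in lists of substitutions.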

In addition to the O-helix of a polymerase,
other regions of the polymerase can be targeted for
random mutagenesis to generate a library of polymerase
mutants to identify polymerase mutants having altered
fidelity. Those skilled in the art can determine other
regions to target for mutagenesis. Such other regions
can be identified, for example, by sequence homology to
other polymerases, which suggests conservation of
function. Conserved sequences can also be used to
identify target regions for mutagenesis based on activity
studies of other polymerases. Protein structural models
revealing the convergence of amino acid residues at the
active site of a polymerase can similarly be used to
identify target regions for mutagenesis.

Alternatively, mutagenesis throughout the
polymerase can be used to identify amino acid residues
critical for polymerase function. Sequences containing
these critical amino acid residues are target sequences
for introducing random mutations to identify mutants
having altered fidelity. Methods for identifying
critical amino acid residues by introducing a small
number of random mutations throughout a gene segment are
well known to those skilled in the art and include, for
example, copying by mutagenic polymerases, exposure of
templates to DNA damaging agents prior to inserting into
cells and replacement of regions of the DNA template with
oligonucleotides containing sparsely populated random
inserts. For example, a population of oligonucleotides
with 91% correct substitutions and 3% of each of the
non-complementary nucleotides at each position can be
generated. Screening for polymerase mutants can be
performed, for example, with the genetic complementation
assay disclosed herein.
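The 91%/3% doping scheme can be sketched as a small simulation. The wild type sequence below is the 39-nucleotide region encoding Arg659 through Tyr671 given in Example I; the function names and the use of a seeded random generator are illustrative assumptions, not part of the disclosed method.

```python
import random

# Codons 659-671 of the Taq DNA polymerase I O-helix, as listed in Example I.
WILD_TYPE = "".join(
    "CGC CGG GCG GCC AAG ACC ATC AAC TTC GGG GTC CTC TAC".split()
)


def dope(seq, p_wild=0.91, rng=None):
    """Return one oligonucleotide in which each position keeps the wild type
    base with probability p_wild and otherwise receives one of the other
    three bases with equal probability (3% each when p_wild is 0.91)."""
    rng = rng or random.Random()
    out = []
    for base in seq:
        if rng.random() < p_wild:
            out.append(base)
        else:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
    return "".join(out)


def expected_substitutions(length, p_wild=0.91):
    """Mean number of nucleotide substitutions per oligonucleotide."""
    return length * (1.0 - p_wild)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only so that the sketch is repeatable
    print(dope(WILD_TYPE, rng=rng))
    print(round(expected_substitutions(len(WILD_TYPE)), 2))
```

Under these assumptions a 39-nucleotide doped region carries about 3.5 nucleotide substitutions per molecule on average, which is what makes the random inserts sparsely populated.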

The invention also provides a method for
identifying a thermostable polymerase having altered
fidelity. The method consists of generating a random
population of polymerase mutants by mutating one or more
amino acid residues adjacent to an immutable or nearly
immutable residue in an active site O-helix of a
thermostable polymerase and screening the population for
one or more active polymerase mutants.



In one embodiment, substitutions at amino acids
adjacent to immutable or nearly immutable residues are
used to identify polymerase mutants having altered
fidelity. The adjacent amino acid residues can be
immediately adjacent in the linear sequence or can be
nearby. Adjacent residues that are nearby can be as many
as two amino acids away from the immutable or nearly
immutable residue in the linear sequence. A nearby
residue can also be nearby in the three-dimensional
structure of the polymerase and can be determined from a
crystallographic molecular model of a polymerase. Nearby
residues are in close enough proximity to an immutable or
nearly immutable residue to modulate the activity of the
polymerase. Generally, nearby residues are within two
amino acid residues in the linear sequence from an
immutable or nearly immutable residue or are within about
5 Å of the immutable or nearly immutable residues, in
particular within about

Substitutions involving amino acid residues
adjacent to immutable or nearly immutable sites have been
found to alter the fidelity of DNA synthesis (see
Examples IV and V). The identified immutable or nearly
immutable amino acid residues correspond to amino acid
residues Arg659, Lys663, Phe667 and Tyr671 of Taq DNA
polymerase I. Thus, the invention is directed to
altering one or more amino acid residues adjacent to an
amino acid residue corresponding to Arg659, Lys663,
Phe667 or Tyr671 in Taq DNA polymerase. Amino acid
residues adjacent to these immutable residues include,
for example, amino acids corresponding to Arg660, Ala661,
Ala662, Thr664, Ile665, Asn666, Gly668, Val669 and Leu670
in Taq DNA polymerase I. Corresponding residues in other
polymerases are also included and can be identified based
on sequence homology or based on corresponding amino
acids in structurally similar domains as defined by a
crystallographic molecular model.
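The linear-sequence definition of adjacency (within two residues of an immutable or nearly immutable position) can be checked with a few lines of code. This is a sketch only; `adjacent_positions` and its parameters are hypothetical names, and distance in the three-dimensional structure is not modeled.

```python
# O-helix positions randomized in Example II (Arg659 through Tyr671).
O_HELIX = range(659, 672)

# Immutable or nearly immutable positions identified by complementation.
IMMUTABLE = {659, 663, 667, 671}


def adjacent_positions(immutable, region, max_distance=2):
    """Positions in region that are not immutable themselves but lie within
    max_distance residues (linear sequence) of an immutable position."""
    return sorted(
        pos
        for pos in region
        if pos not in immutable
        and any(abs(pos - fixed) <= max_distance for fixed in immutable)
    )


if __name__ == "__main__":
    print(adjacent_positions(IMMUTABLE, O_HELIX))
    print(adjacent_positions(IMMUTABLE, O_HELIX, max_distance=1))
```

With a distance of two, the result is exactly the nine positions listed above (660, 661, 662, 664, 665, 666, 668, 669 and 670); restricting the distance to one leaves only the immediately adjacent positions.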

The methods of the invention are also directed
to altering residues immediately adjacent to the
immutable or nearly immutable residues. Thus, the
methods of the invention are directed to altering
residues adjacent to required residues on DNA polymerases
and identifying those mutations which have an effect on
the fidelity of DNA synthesis.

The invention further provides methods for
determining the fidelity of the active polymerase mutant.
The fidelity of active polymerase mutants can be
determined by several methods. The active polymerases
can be, for example, screened for altered fidelity from
crude extracts of bacterial cells grown from the viable
colonies. Methods for determining fidelity of synthesis
are disclosed herein (see Example III). In one method, a
primer extension assay is used with a biased ratio of
nucleoside triphosphates consisting of only three of the
nucleoside triphosphates. Elongation of the primer past
template positions that are complementary to the omitted
nucleoside triphosphate substrate in the reaction mixture
results from errors in DNA synthesis. High fidelity
polymerases will terminate synthesis when they
encounter a template nucleotide complementary to the
missing nucleoside triphosphate, whereas low fidelity
polymerases will be more likely to misincorporate a
non-complementary nucleotide. The accuracy of incorporation
for the primer extension assay can be measured by
physical criteria such as by determining the size or the
sequence of the extension product. This method is
particularly suitable for screening for low fidelity
mutants since increases in chain elongation are easily
and rapidly quantitated.
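The logic of the three-nucleotide primer extension assay can be sketched as follows. The template sequence in the example is hypothetical and chosen only for illustration; the function reports how far an error-free polymerase would extend the primer before it reaches a template base that requires the omitted nucleoside triphosphate.

```python
# Watson-Crick partner of each incoming nucleotide on the template strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def expected_stop(template_downstream, missing_dntp):
    """Number of nucleotides added before an error-free polymerase stalls.

    template_downstream lists the template bases in the order in which they
    are copied, starting with the first base after the primer terminus.
    missing_dntp is the base ('A', 'C', 'G' or 'T') of the omitted dNTP.
    """
    blocking_base = COMPLEMENT[missing_dntp]
    for added, base in enumerate(template_downstream):
        if base == blocking_base:
            return added
    return len(template_downstream)  # no blocking base: full-length product


if __name__ == "__main__":
    # Hypothetical template; dATP omitted, so the first template T blocks.
    print(expected_stop("GCTAAGC", "A"))
```

Extension products longer than this expected length indicate misincorporation opposite the blocking template base, which is why longer products point to lower fidelity.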

A second method for determining the fidelity of
polymerase mutants employs a forward mutation assay. A
template containing a single stranded gap in a reporter
gene such as lacZ is used for the forward mutation assay.
Filling in of the gapped segment is carried out by crude
heat denatured extracts of bacteria harboring plasmids
expressing a thermostable DNA polymerase mutant. For
determining low fidelity polymerase mutants, reactions
are carried out in the presence of equimolar
concentrations of each nucleoside triphosphate. For
determining high fidelity polymerase mutants, the
reaction is carried out with a biased pool of nucleoside
triphosphates. Using a biased pool of nucleoside
triphosphates results in incorporation of errors in the
synthesized strand that are proportional to the ratio of
non-complementary to complementary nucleoside
triphosphates in the reaction. Therefore, the bias
exaggerates the errors produced by the polymerases and
facilitates the identification of high fidelity mutants.
The fidelity of DNA synthesis is determined from the
number of mutations produced in the reporter gene.

Procedures other than those described above for
identifying and characterizing the fidelity of a
polymerase are known in the art and can be substituted
for identifying high or low fidelity mutants. Those
skilled in the art can determine which procedures are
appropriate depending on the needs of a particular
application.

Also provided herein is an isolated
thermostable polymerase mutant having altered fidelity.
The polymerase mutant has one or more mutated amino acid
residues in the active site O-helix of a thermostable
polymerase. Additionally provided is an isolated
thermostable polymerase mutant having altered fidelity.
The polymerase mutant has one or more mutated amino acid
residues adjacent to an immutable or nearly immutable
amino acid residue in the active site O-helix of a
thermostable polymerase. The mutated amino acid residue
is adjacent to an amino acid residue corresponding to
Arg659, Lys663, Phe667 or Tyr671 in Taq DNA polymerase.

The invention also provides an isolated
thermostable polymerase mutant having altered fidelity,
where the polymerase has one or more mutated amino acid
residues adjacent to an amino acid residue corresponding
to Arg659, Lys663, Phe667 or Tyr671 in Taq DNA polymerase
and the mutant is a high fidelity mutant.

Using the methods of the invention, a number of
mutants have been identified as having high fidelity of
DNA synthesis. For example, polymerases having one or
more single-base substitutions adjacent to Arg659,
Lys663, Phe667, and Tyr671 in the nucleotide sequence of
Taq DNA polymerase I have been identified. Specific
examples of these high fidelity mutants include
polymerases having the single substitutions
Asn666Asp, Asn666Ile, Ile665Leu, Leu670Val, Arg660Tyr,
Arg660Ser, Gly668Arg, Arg660Lys, Gly668Ser and Gly668Gln;
polymerases having the double substitutions consisting of
Thr664Ile together with Asn666Asp, and Ala661Ser together
with Val669Leu; as well as polymerases having the triple
substitutions consisting of Thr664Pro, Ile665Val together
with Asn666Tyr, and Ala661Glu, Ile665Thr together with
Phe667Leu. Additional high fidelity mutants include, for
example, Phe667Leu and Phe667Tyr.

The invention provides a high fidelity
polymerase mutant having one or more amino acid
substitutions selected from the group consisting of
Phe667Leu; Asn666Asp; Asn666Ile; Ile665Leu; Leu670Val;
Arg660Tyr; Arg660Ser; Gly668Arg; Arg660Lys; Gly668Ser;
Gly668Gln; Thr664Ile and Asn666Asp; Ala661Ser and
Val669Leu; Ala661Glu, Ile665Thr, and Phe667Leu; and
Thr664Pro, Ile665Val and Asn666Tyr. The polymerase
mutant Phe667Tyr has been previously described and is
excluded from the compositions of the invention.

The invention also provides an isolated
thermostable polymerase mutant having altered fidelity,
where the polymerase has one or more mutated amino acid
residues adjacent to an amino acid residue corresponding
to Arg659, Lys663, Phe667 or Tyr671 in Taq DNA polymerase
and the mutant is a low fidelity mutant. The invention
additionally provides a low fidelity polymerase mutant
having one or more amino acid substitutions selected from
the group consisting of Ala661Glu; Ala661Pro; Thr664Pro;
Thr664Asn; Thr664Arg; Asn666Val; Thr664Pro and Val669Ile;
Arg660Pro and Leu670Thr; Arg660Trp and Thr664Lys;
Ala662Gly and Thr664Asn; Ala661Gly and Asn666Ile;
Ala661Pro and Asn666Ile; and Ala661Ser, Ala662Gly,
Thr664Ser and Asn666Ile.

Low fidelity mutant DNA polymerases include
mutations involving substitutions at Ala661, Thr664,
Asn666, and Leu670. Specific examples of low fidelity
mutants include polymerases having the
single substitutions Ala661Glu, Ala661Pro, Thr664Pro,
Thr664Asn, Thr664Arg and Asn666Val; polymerases having
the double substitutions consisting of Thr664Pro together
with Val669Ile, Arg660Pro together with Leu670Thr,
Arg660Trp together with Thr664Lys, Ala662Gly together
with Thr664Asn, Ala661Gly together with Asn666Ile, and
Ala661Pro together with Asn666Ile; as well as polymerases
having four substitutions consisting of Ala661Ser,
Ala662Gly, Thr664Ser together with Asn666Ile.

For both the high fidelity and the low fidelity
mutations described above, the invention provides
polymerases other than Taq DNA polymerase having
mutations at corresponding positions. In particular, the
invention provides thermostable polymerases other than
Taq DNA polymerase that have mutations at corresponding
positions and that have altered fidelity. Those skilled
in the art can determine corresponding positions based on
sequence homology between the polymerases.

The invention also provides an isolated nucleic
acid molecule encoding a polymerase mutant having high
fidelity. The nucleic acid molecule contains a
nucleotide sequence encoding substantially an amino acid
sequence of Taq DNA polymerase I having one or more amino
acid substitutions selected from the group consisting of
Phe667Leu; Asn666Asp; Asn666Ile; Ile665Leu; Leu670Val;
Arg660Tyr; Phe667Tyr; Arg660Ser; Gly668Arg; Arg660Lys;
Gly668Ser; Gly668Gln; Thr664Ile and Asn666Asp; Ala661Ser
and Val669Leu; Ala661Glu, Ile665Thr, and Phe667Leu; and
Thr664Pro, Ile665Val and Asn666Tyr.

Additionally provided is an isolated nucleic
acid molecule encoding a polymerase mutant having low
fidelity. The nucleic acid molecule contains a
nucleotide sequence encoding substantially an amino acid
sequence of Taq DNA polymerase I having a substitution of
one or more amino acids selected from the group
consisting of Ala661, Thr664, Asn666 and Leu670. The
invention also provides a polymerase mutant having one or
more amino acid substitutions selected from the group
consisting of Ala661Glu; Ala661Pro; Thr664Pro; Thr664Asn;
Thr664Arg; Asn666Val; Thr664Pro and Val669Ile; Arg660Pro
and Leu670Thr; Arg660Trp and Thr664Lys; Ala662Gly and
Thr664Asn; Ala661Gly and Asn666Ile; Ala661Pro and
Asn666Ile; and Ala661Ser, Ala662Gly, Thr664Ser and
Asn666Ile.
The invention also provides methods for the
identification of one or more mutations in a gene using
the high fidelity mutant DNA polymerases of the
invention. For example, the use of a high fidelity
mutant to amplify a gene of interest gives greater
confidence that the amplified sequence will more
accurately reflect the actual sequence in the sample and
minimizes the introduction of artifactual mutations
during amplification of the gene. The higher accuracy of
gene amplification provided by a high fidelity mutant
also improves the identification of genetic mutations due
to the increased confidence that observed mutations are
more likely to reflect genetic mutations in the sample
rather than artifactual mutations introduced during
amplification.

Additionally, the invention provides methods
for identifying one or more mutations in a gene by
amplifying the gene using a high fidelity polymerase
mutant under conditions which allow polymerase chain
reaction amplification. The gene is amplified by
exposing the strands of the gene to repeated cycles of
denaturing, annealing and elongation to produce an
amplified gene product. Methods for amplifying genes
using PCR are well known to those skilled in the art and
include those described previously in PCR Primer, A
Laboratory Manual, Dieffenbach and Dveksler, eds., Cold
Spring Harbor Press, Plainview, New York (1995). The
presence or absence of one or more mutations in the gene
can be determined by sequencing the amplified product
using methods well known to those skilled in the art.

The invention provides methods for accurately
copying repetitive nucleotide sequences by amplifying the
repetitive nucleotide sequence using a high fidelity
polymerase mutant. The repetitive nucleotide sequence
can be in a gene or in a microsatellite between genes.
The methods of amplifying the repetitive nucleotide
sequences are carried out under conditions which allow
PCR amplification with repeated cycles of denaturing,
annealing and elongation as described above.



The high fidelity mutants of the invention are
advantageous for copying repetitive nucleotide sequences
such as repetitive DNA because polymerases found in
nature undergo slippage when copying DNA containing
repetitive sequences. Therefore, when polymerases found
in nature are used, the amplification products of a
nucleotide sequence containing a repetitive sequence do
not accurately reflect the size or sequence of a DNA
sequence in a sample. However, the use of a high
fidelity polymerase mutant greatly increases the accuracy
of an amplification product to reflect the actual size
and sequence of the repetitive DNA sequence in the
sample. Repetitive DNA can be found in microsatellites,
which contain multiple repetitive nucleotide sequences
and are dispersed throughout the genome. These
repetitive di-, tri- and tetranucleotides are frequently,
but not invariably, located between genes.

The invention also provides a method for
determining an inherited mutation by amplifying a gene
using a high fidelity polymerase mutant. Such an
inherited mutation can be correlated with a genetic
disease, thereby allowing diagnosis of the genetic
disease. The invention additionally provides methods for
diagnosing a genetic disease by amplifying a gene using a
high fidelity polymerase mutant. A genetic disease is
one in which a disease is caused by a genetic mutation in
a coding or non-coding region of DNA. Such a genetic
mutation can be a somatic mutation or a germline
mutation. The methods of the invention can be used to
diagnose any genetic disease using high fidelity
polymerase mutants. Such genetic diseases can involve
point mutations, insertions, and deletions.

The methods of the invention employ high
fidelity polymerase mutants and can similarly be used to
diagnose genetic diseases involving repetitive DNA. In
one embodiment, the genetic disease involves mutations in
a microsatellite or repetitive DNA. Microsatellites are
relatively stable in normal cells but are found to be
unstable and to vary in length in some forms of
hereditary and non-hereditary cancer, including
hereditary nonpolyposis colorectal cancer (HNPCC), other
cancers that arise in HNPCC families, Muir-Torre syndrome
and small-cell lung cancer (Loeb, Cancer Res.
54:5059-5063 (1994); Brentnall, Am. J. Pathol. 147:561-563
(1995); Honchel et al., Semin. Cell Biol. 6:45-52 (1995);
Eshleman and Markowitz, Curr. Opin. Oncol. 7:83-89
(1995)). Microsatellite instability appears to be
confined to tumors and is not present in normal tissues
of affected individuals.



The accuracy of amplification products of
repetitive DNA sequences provided by the high fidelity
mutants of the invention can be used to diagnose diseases
involving mutations in repetitive DNA sequences. For
example, with tumor samples, the accurate amplification
of repetitive DNA sequences can be used to diagnose those
cancers involving variable length in microsatellite DNA.
Since microsatellite instability appears to be confined
to tumors, amplification of repetitive DNA using the high
fidelity mutants of the invention can additionally be
applied to determining the prognosis or extent of disease
of a cancer patient, evaluating outcomes of therapy,
staging tumors and determining tumor status. High
fidelity mutants of the invention can also be applied to
amplify DNA in blood samples to identify circulating
cells containing microsatellite instability as an
indicator of a cancerous state.

Other genetic diseases also involve repetitive
DNA sequences, in particular, unstable triplet repeats.
These unstable triplet repeat diseases involve increasing
lengths of triplet repeat regions, ranging from ~50
repeats in normal individuals, ~200 repeats in carriers
to ~2000 repeats in affected individuals. Such unstable
triplet repeat diseases include, for example, fragile X
syndrome, spinal and bulbar muscular atrophy, myotonic
dystrophy, Huntington's disease, spinocerebellar ataxia
type 1, fragile X E mild mental retardation and
dentatorubral pallidoluysian atrophy (Monckton and
Caskey, Circulation 91:513-520 (1995)). The diagnosis of
unstable triplet repeat diseases is particularly valuable
since the onset of symptoms can occur later in some
diseases and the severity of the symptoms of some
diseases can be correlated with the size of the extended
triplet repeat region. Thus, amplification of these
triplet repeat regions to more accurately reflect the
actual size of the triplet repeat in the individual
provides more accurate diagnosis and prognosis of the
disease. Amplification of the large expanded regions
associated with triplet repeat diseases can be carried
out using low fidelity polymerase mutants of the
invention since low fidelity polymerase mutants would be
more likely to copy through very long stretches of
repetitive nucleotide sequences.
One method for identifying a genetic disease
involves utilization of primers that hybridize to
specific genes. The primers contain 3'-terminal
nucleotides complementary to the corresponding nucleotide
in the mutant but not to the wild type gene. The
primer, which is mismatched with the wild type gene, is
extended on the template
in the presence of a high fidelity mutant polymerase.
The presence of an extension product is indicative of a
mutant gene.

The mismatch PCR method is based on the fact
that a PCR primer that is not complementary to the
template at the 3' end is an inefficient substrate for
polymerases such as Taq DNA polymerase I. Wild type Taq
DNA polymerase will occasionally misextend a mismatched
primer, resulting in a false positive in an assay for a
gene mutation. For example, a mutant gene with a rare TT
mutation would be difficult to specifically amplify out
of a pool of DNA molecules containing a wild type CC at
the position of the TT mutant because wild type Taq DNA
polymerase would occasionally misextend the wild type
gene using the mismatched primer. In contrast, a high
fidelity polymerase would not extend the mismatched
primer. The products of a high fidelity polymerase in
the mismatch PCR assay would therefore correspond to the
mutant gene and would have fewer false positives than
that observed with wild type Taq DNA polymerase. Thus,
the more discriminating assay based on the use of high
fidelity polymerases results in a better assay for
detecting somatic mutations. The use of high fidelity
mutants in such a mismatch-PCR based assay is disclosed
herein (see Example V).
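The primer design behind the mismatch PCR assay can be sketched by counting mismatches at the 3' end of a primer against each allele. The sequences below are hypothetical and only mirror the TT versus CC example; both the primer and the target are written 5' to 3' in the same sense, so a perfect match means identical bases.

```python
def terminal_mismatches(primer, target, n_terminal=2):
    """Count mismatches within the last n_terminal positions at the 3' end.

    primer and target are equal-length strings written 5'->3' in the same
    sense (the target is the strand the primer is meant to reproduce).
    """
    if len(primer) != len(target):
        raise ValueError("primer and target must have the same length")
    return sum(
        p != t for p, t in zip(primer[-n_terminal:], target[-n_terminal:])
    )


if __name__ == "__main__":
    # Hypothetical sequences: the primer ends in the mutant TT dinucleotide.
    primer = "GATTACAGTT"
    mutant_allele = "GATTACAGTT"
    wild_type_allele = "GATTACAGCC"
    print(terminal_mismatches(primer, mutant_allele))     # matched 3' end
    print(terminal_mismatches(primer, wild_type_allele))  # mismatched 3' end
```

A primer with no 3'-terminal mismatches against the mutant allele and two against the wild type allele is the situation in which a polymerase that refuses to extend mismatched termini yields product only from the mutant gene.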

The invention also provides a method for
randomly mutagenizing a gene by amplifying the gene using
the low fidelity polymerase mutants of the invention.
The low fidelity polymerase mutants exhibit an efficiency
of accurate base incorporation that is less than that of
wild type polymerases. The efficiency of the low
fidelity polymerase mutant is about 50% or more,
generally 10% or more, and particularly 1% or more of
that of a wild type polymerase. These low fidelity
polymerase mutants would therefore exhibit between 2-fold
and 100-fold lower fidelity than wild type polymerase.
The introduction of mutations into specific genes using
low fidelity polymerase mutants of the invention is
useful for determining the effects of mutations on the
function of those gene products.
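The relationship between relative efficiency and fold reduction in fidelity is a simple reciprocal, as the following sketch shows. The function name is hypothetical; the three input values are the thresholds stated above.

```python
def fold_reduction(relative_efficiency):
    """Fold reduction in fidelity for a mutant whose efficiency of accurate
    base incorporation is the given fraction of the wild type value."""
    if not 0.0 < relative_efficiency <= 1.0:
        raise ValueError("relative efficiency must be in the interval (0, 1]")
    return 1.0 / relative_efficiency


if __name__ == "__main__":
    for fraction in (0.50, 0.10, 0.01):
        print(f"{fraction:.0%} of wild type -> {fold_reduction(fraction):.0f}-fold lower")
```

An efficiency of 50% of wild type thus corresponds to a 2-fold reduction and an efficiency of 1% to a 100-fold reduction, which gives the stated 2-fold to 100-fold range.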

It is understood that modifications which do
not substantially affect the activity of the various
embodiments of this invention are also included within
the definition of the invention provided herein.
Accordingly, the following examples are intended to
illustrate but not limit the present invention.

EXAMPLE I 

Random Sequence Mutagenesis and Identification of Active
Taq DNA Polymerase Mutants

This example demonstrates random nucleotide 
sequence mutagenesis of a polymerase target sequence and 
identification of active polymerase mutants. 

Random sequence mutagenesis was used to
introduce mutations into the O-helix of Taq DNA
polymerase. Briefly, the Taq DNA polymerase I gene was
obtained from the bacterial chromosome by cloning in
pKK223-3 (Pharmacia Biotech, Piscataway, NJ). A 3.2-kb
fragment containing the Taq DNA polymerase I gene,
including the 5'-3' exonuclease domain and the tac
promoter region, was further transferred into the SalI
site of pHSG576 (pTacTaq). The Taq DNA polymerase I gene
was sequenced to confirm wild type sequence except for
the lack of the N-terminal three amino acids.

A vector containing a nonfunctional insert
within the Taq DNA polymerase I gene was constructed, and
the insert was subsequently replaced with an
oligonucleotide containing the random sequence, to avoid
contamination with incompletely cut vectors. To generate
the nonfunctional vector, a SacII site was produced using
site-directed mutagenesis by changing 2070C to G using a
synthetic oligomer, 5'-GGG TCC ACG GCC TCC CGC GGG ACG
CCG AAC ATC CAG CTG (SEQ ID NO: 3) (SacII-2) and the
single-stranded plasmid pFC85 (Kunkel, Proc. Natl. Acad.
Sci. USA 82:488-492 (1985)). The BstXI-NheI fragment that
carries the SacII site was substituted for the
corresponding fragment in pTacTaq (pTacTaqSac). A
SacII-NheI fragment in pTacTaqSac was further replaced
with the synthetic oligomer 5'-GGA CTG CAT ATG ACT G
(SEQ ID NO: 4) (DUM-U) hybridized with 5'-CTA GCA GTC ATA
TGC AGT CCG C (SEQ ID NO: 5) (DUM-D) to create the
nonfunctional vector (Dube et al., Biochemistry
30:11760-11767 (1991)).

Oligonucleotides containing 9% random sequence,
in which each nucleotide indicated in parentheses was 91%
wild type nucleotide and 3% each of the other three
nucleotides, were synthesized by Keystone Laboratories
(Menlo Park, CA): O+9 RANDOM is 5'-CGG GAG GCC GTG GAC
CCC CTG ATG (CGC CGG GCG GCC AAG ACC ATC AAC TTC GGG GTC
CTC TAC) GGC ATG TCG GCC CAC CG (SEQ ID NO: 6); O-0 RANDOM
is 5'-TGG CTA GCT CCT GGG AGA GGC GGT GGG CCG ACA TGC C
(SEQ ID NO: 7). The 17 nucleotide sequences at the 3'
ends of the two oligonucleotides are complementary.
Equimolar amounts of these oligonucleotides (20 pmol)
were mixed, hybridized, and extended by five cycles of
PCR (94°C for 30 sec, 57°C for 30 sec, and 72°C
for 30 sec) in a 100 µl reaction mixture containing 10 mM
Tris-HCl (pH 8.3), 50 mM KCl, 1.5 mM MgCl2, 0.001%
gelatin, 50 µM dNTPs, and 2.5 units of Taq DNA polymerase
I. This PCR product (10 µl) was further amplified for 25
cycles with 20 pmol of O(+) PRIMER (5'-TTC GGC GTC CCG CGG
GAG GCC GTG GAC CCC CT) (SEQ ID NO: 8) and 20 pmol of
O(-) PRIMER (5'-GTA AGG GAT GGC TAG CTC CTG
GGA) (SEQ ID NO: 9) under the same conditions. The
amplified product was purified by phenol/chloroform
extraction followed by ethanol precipitation and
digestion with the restriction enzymes SacII and NheI
at 37°C for 30 min in 50 mM Tris-HCl (pH 7.9), 50 mM
NaCl, 10 mM MgCl2 and 1 mM dithiothreitol. The
restriction fragment containing the random sequence was
purified by phenol/chloroform extraction, ethanol
precipitation, and filtration using a Microcon 30 filter
(Amicon, Beverly, MA). For the totally random library,
five oligonucleotides (80-mers), each having totally
random sequence at one of the codons 659, 660, 663, 667
or 668, were combined in equal amounts and hybridized to
O-0 RANDOM. After extension and digestion with
endonucleases, the combined products were purified and
processed as above.
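The stated 17-nucleotide complementarity between the 3' ends of the two oligonucleotides can be verified directly from the sequences given above. In this sketch the doped positions of O+9 RANDOM are written as their wild type bases; the function names are illustrative.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# O+9 RANDOM with the parenthesized (doped) region shown as wild type.
PLUS = "".join(
    "CGG GAG GCC GTG GAC CCC CTG ATG "
    "CGC CGG GCG GCC AAG ACC ATC AAC TTC GGG GTC CTC TAC "
    "GGC ATG TCG GCC CAC CG".split()
)

# O-0 RANDOM.
MINUS = "".join("TGG CTA GCT CCT GGG AGA GGC GGT GGG CCG ACA TGC C".split())


def three_prime_overlap(a, b, n):
    """True if the last n bases of a pair with the last n bases of b."""
    return a[-n:] == reverse_complement(b[-n:])


if __name__ == "__main__":
    print(len(PLUS), len(MINUS))
    print(three_prime_overlap(PLUS, MINUS, 17))
```

The check confirms that the last 17 bases of each oligonucleotide anneal to one another, which is what allows the pair to prime mutual extension in the first five PCR cycles.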

A random library of Taq DNA polymerase genes
containing randomized nucleotide sequence corresponding
to the O-helix was generated by digesting the vector
containing the nonfunctional insert with NheI and SacII
restriction endonucleases. The large DNA fragment was
isolated by electrophoresis in a 0.8% agarose gel and
purified by using GeneClean II (Bio101, Vista, CA). This
large fragment, lacking the nonfunctional insert, was
ligated with an oligonucleotide containing randomized
sequence by incubating overnight at 16°C with T4 DNA
ligase. The ligation mixture was then used to transform
DH5α by electroporation according to Bio-Rad (Hercules,
CA). After electroporation, 1 ml of SOC (2%
bactotryptone/0.5% yeast extract/10 mM NaCl/2.5 mM KCl/10
mM MgCl2/10 mM MgSO4/20 mM glucose) was added and
incubation continued for 1 h at 37°C. An aliquot was
plated on 2xYT (16 g/liter tryptone, 10 g/liter yeast
extract, 5 g/liter NaCl, pH 7.3) containing 30 µg/ml
chloramphenicol to determine the total number of
transformants, and the remainder was inoculated into 500
ml of 2xYT containing 30 µg/ml chloramphenicol and
cultured at 37°C overnight. Plasmids (random library
vector) were purified and used for transformation of
recA718 polA12 strain.

For genetic complementation to determine active
polymerase mutants, E. coli recA718 polA12 cells (SC18-12
E. coli B/r strain, which has the genotype recA718 polA12
uvrA155 trpE65 lon-11 sulA1) were transformed with
plasmids pHSG576 or pTacTaq by electroporation (Bio-Rad
Genepulser, 2 kV, 25 µFD, 400 Ω) (Sweasy and Loeb, supra,
(1992); Sweasy and Loeb, Proc. Natl. Acad. Sci. USA
90:4626-4630 (1993); Witkin and Roegner-Maniscalo, J.
Bacteriol. 174:4166-4168 (1992)). Thereafter, 1 ml of
nutrient broth (NB) (8 g/liter) containing NaCl
(4 g/liter) and 1 mM isopropyl β-D-thiogalactoside (IPTG)
was added and the mixture was incubated for 1 h at 37°C.
The transformed cells were plated on nutrient agar plates
(containing 23 g/liter Difco nutrient agar, 5 g/liter
NaCl, 30 µg/ml chloramphenicol, 12.5 µg/ml tetracycline
and 1 mM IPTG) and grown at 30°C overnight. Single
colonies were transferred to NB for growth to logarithmic



WO 98/23733 PCT/US97/21940 


phase at 30°C. Thereafter, ~10 µl (10⁴ cells) was
introduced at the center of an agar plate, and the
inoculation loop was gradually moved from the center to
the periphery as the plate was rotated. Duplicate plates
were incubated at 30°C or 37°C for 30 h. To determine
complementation efficiency by Taq DNA polymerase I and to
isolate mutants, cultures of the recA718 polA12 strain
harboring either pHSG576 or Taq DNA polymerase I were
diluted with NB medium and plated (~500 colonies per
plate). Duplicate plates were incubated at 30°C or 37°C,
and visible colonies were counted after a 30 h
incubation. Complementation was verified by a second
round of electroporation and colony formation at the
nonpermissive temperature. Cell-free extracts were
prepared from selected colonies obtained at the
restrictive temperature and assayed to confirm that they
contained a temperature-resistant DNA polymerase activity
(Lawyer et al., J. Biol. Chem. 264:6427-6437 (1989)).

Wild type Taq DNA polymerase I was tested for
its ability to complement a temperature sensitive
polymerase contained in the E. coli strain recA718
polA12, which is unable to grow at 37°C in rich media at
low cell density (Witkin and Roegner-Maniscalo, 1992,
supra). The temperature sensitive phenotype of E. coli
strain recA718 polA12 was complemented by transformation
with the pTacTaq plasmid encoding wild type Taq DNA
polymerase I, as indicated by growth at 37°C. Therefore,
this E. coli strain containing a temperature sensitive
polymerase provides a good model system for testing Taq
DNA polymerase I mutants.



To evaluate the involvement of different amino
acid residues in catalysis by Taq DNA polymerase I,
random sequences were substituted for nucleotides
encoding a portion of the substrate binding site of Taq
DNA polymerase I (O-helix, amino acids Arg659 through
Tyr671). The substituted stretch was 39 nucleotides long
with 9% randomization. At each position the proportion
of the wild type nucleotide was 91% and the other 3
nucleotides were present in equal amounts (3% each).
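
The doping scheme described above lends itself to a quick
calculation. The following sketch assumes, as a
simplification not stated in the text, that each of the 39
positions is substituted independently at the 9% rate. It
estimates the mean number of nucleotide substitutions per
insert and the probability that an insert carries exactly k
substitutions. All names are illustrative.

```python
from math import comb

STRETCH_LENGTH = 39       # randomized nucleotides (from the text)
SUBSTITUTION_RATE = 0.09  # per-position chance of a non-wild-type base


def expected_substitutions(length=STRETCH_LENGTH, rate=SUBSTITUTION_RATE):
    """Mean number of nucleotide substitutions per insert."""
    return length * rate


def probability_of_k_substitutions(k, length=STRETCH_LENGTH, rate=SUBSTITUTION_RATE):
    """Binomial probability that an insert carries exactly k substitutions."""
    return comb(length, k) * rate**k * (1 - rate) ** (length - k)


if __name__ == "__main__":
    print(f"mean substitutions per insert: {expected_substitutions():.2f}")
    for k in range(4):
        print(f"P({k} substitutions) = {probability_of_k_substitutions(k):.3f}")
```

Under the independence assumption, an insert carries about
3.5 nucleotide substitutions on average, and only a small
fraction of inserts are expected to be free of any
nucleotide change.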

A library of 50,000 independent mutants was
obtained. The number of colonies obtained at 37°C was
11.8% of that obtained at 30°C. Therefore, screening a
randomized library using E. coli strain recA718 polA12
provided approximately 5900 colonies containing active
Taq DNA polymerase and potential polymerase mutants.
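
The estimate of approximately 5900 active clones follows
from the two figures reported above. A minimal sketch of
that arithmetic, with illustrative names:

```python
LIBRARY_SIZE = 50_000    # independent mutants in the library
SURVIVAL_AT_37C = 0.118  # colonies at 37°C relative to 30°C


def estimated_active_clones(library_size=LIBRARY_SIZE, survival=SURVIVAL_AT_37C):
    """Library members expected to complement the host at 37°C."""
    return round(library_size * survival)


if __name__ == "__main__":
    print(estimated_active_clones())
```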

These results show that a randomized library
can be used to generate a population of polymerase
mutants. These results also show the identification of
active Taq DNA polymerase I mutants by screening for
active polymerase mutants using genetic selection.

EXAMPLE II

Identification of Taq DNA Polymerase I Mutants and
Immutable or Nearly Immutable Amino Acid Residues

This example describes the identification of Taq
DNA polymerase I mutants generated by a randomized
library and the identification of immutable or nearly
immutable amino acid residues.

The active Taq DNA polymerase I mutants
identified by the screen described in Example I were
further characterized. The entire random nucleotide-
containing insert was sequenced from a total of 234
plasmids obtained at 37°C (positively selected), 16
plasmids obtained at 30°C (nonselected) and 29 plasmids
obtained at 30°C, which failed to grow at 37°C (negatively
selected). All substitutions were in the randomized
nucleotides except for 12 clones.



Among the 230 positive plasmids, 168 contained
silent mutations in one or more codons. At the amino
acid level, 106 encoded the wild type residue and 124
encoded substitutions, in accord with the expected
distribution in the plasmid population. Of the 124
plasmids with amino acid changes, 40 were unique mutants
obtained just once. The remaining 84 plasmids
represented 21 different mutants. At least 79% of those
encoding the same amino acid substitutions were
independently derived since they contained different
silent mutations in other codons. In total, 61 different
amino acid sequences were obtained that complemented the
temperature-sensitive phenotype of the recA718 polA12
host.
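
The tallies in the preceding paragraph can be
cross-checked with a few lines of arithmetic. The sketch
below only restates the reported counts; the names are
illustrative.

```python
POSITIVE_PLASMIDS = 230   # positively selected plasmids analyzed
WILD_TYPE_PROTEIN = 106   # encode the wild type amino acid sequence
SUBSTITUTED = 124         # encode one or more amino acid substitutions
SEEN_ONCE = 40            # unique mutants obtained just once
SEEN_REPEATEDLY = 84      # plasmids representing recurrent mutants
RECURRENT_MUTANTS = 21    # different mutants among those 84 plasmids


def tallies_are_consistent():
    """Check that the reported subtotals add up to their parent totals."""
    return (
        WILD_TYPE_PROTEIN + SUBSTITUTED == POSITIVE_PLASMIDS
        and SEEN_ONCE + SEEN_REPEATEDLY == SUBSTITUTED
    )


def distinct_active_sequences():
    """Number of different amino acid sequences that complemented the host."""
    return SEEN_ONCE + RECURRENT_MUTANTS
```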



A compilation of the amino acid substitutions
found in Taq DNA polymerase I is shown in Figure 2.
Solid boxes indicate the amino acid residues for which no
substitutions were detected. Dashed boxes mark the amino
acid positions where only conservative substitutions were
found. The amino acid positions of Taq DNA polymerase I
and corresponding positions of E. coli DNA polymerase I
are indicated at the top. WT represents the wild type
sequence and randomized amino acids are written in
boldface type. The amino acids that have not been found
in the DNA polymerase I family are outlined (Braithwaite
and Ito, Nucleic Acids Res. 21:787-802 (1993)). Panel A
shows single mutations selected from the 9% library
listed under the wild type amino acids. Panel B shows
the sequence of each multiply substituted mutant selected
from the 9% library. Panel C shows mutations selected
from the totally random library.

The distribution of single amino acid
substitutions among the active mutants was not random
(see Figure 2A). For example, numerous diverse
substitutions were observed at Ala661 and Thr664. In
contrast, no substitutions were detected at five
positions (Arg659, Arg660, Lys663, Phe667 and Gly668).
This uneven distribution of replacements is unlikely to
be the result of a bias in the nucleotide composition of
the random insert since sequencing of both the
nonselected and negatively selected plasmids revealed
multiple nucleotide substitutions at each of the targeted
positions and because silent mutations were detected at
each of these positions in the selected clones.

A nonrandom distribution of substitutions was
also observed among active mutants containing multiple
substitutions (see Figure 2B). Again, Ala661 and Thr664
were replaced with a variety of residues. However, no
amino acid substitutions were observed in place of
Arg659, Lys663 and Gly668, even though different silent
nucleotide substitutions were found at each of these
positions. A comparison of Figure 2A and B shows that
substitutions at Arg660 and Phe667 occur only in the
presence of substitutions at other positions. In
addition to the mutants containing multiple substitutions
shown in Figure 2B, two additional triple mutants were
also found: mutant 44, with Ala661Pro, Thr664Arg and
Val669Leu; and mutant 54, with Ala661Thr, Thr664Pro and
Ile665Val.

The partially substituted library (9%) does not
provide a rigorous test of the immutability of specific
codons. Only 0.07% of sequences at each codon would be
expected to contain nucleotide substitutions at all three
positions. To further probe the mutability of specific
amino acid residues, a second library was constructed
that contained totally random substitutions at a limited
number of designated codons. In this library,
nucleotides encoding each of the five amino acids Arg659,
Arg660, Lys663, Phe667 and Gly668 were randomized. These
were amino acid positions that did not yield single
substitutions in the 9% random library (Figure 2A).
Approximately 1300 transformants, which is 4 times more
than the number required for each possible substitution
at each of the target codons, were screened. At the
nonpermissive temperature, 113 colonies were obtained, 84
of which contained codons that encoded the wild type
amino acid sequence. Most of the amino acid
substitutions occurred in place of Arg660 or Gly668.
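
Two of the figures quoted above can be reproduced with a
short calculation. The sketch below assumes that the three
positions of a codon are doped independently at 9%, and it
interprets "the number required" as one clone for each of
the 64 possible codons at each of the 5 targeted
positions. That interpretation is an assumption made here
for illustration and is not stated in the text.

```python
DOPING_RATE = 0.09             # per-nucleotide substitution rate, 9% library
CODONS_PER_POSITION = 4 ** 3   # 64 possible codons
TARGET_CODONS = 5              # Arg659, Arg660, Lys663, Phe667, Gly668
TRANSFORMANTS_SCREENED = 1300


def fraction_fully_substituted(rate=DOPING_RATE):
    """Fraction of codons with substitutions at all three positions."""
    return rate ** 3


def library_coverage(screened=TRANSFORMANTS_SCREENED,
                     targets=TARGET_CODONS,
                     codons=CODONS_PER_POSITION):
    """Clones screened per possible codon variant (assumed definition)."""
    return screened / (targets * codons)


if __name__ == "__main__":
    print(f"fully substituted codons: {fraction_fully_substituted():.2%}")
    print(f"coverage: {library_coverage():.1f}-fold")
```

With these assumptions the fully substituted fraction is
about 0.07% per codon and the coverage is roughly 4-fold,
in line with the values given in the text.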

Again, Arg659 and Lys663 were completely
conserved, with 16 and 5 silent mutations scored at these
codons, respectively. The expected numbers of silent
mutations were 21 and 4.2, respectively, assuming that
the 5 randomized oligomers that comprised the library
were mixed in equimolar proportions. These numbers show
that the oligomers were roughly equally represented in
the library and that sufficient mutants were sampled to
conclude that Arg659 and Lys663 are immutable in these
genetic complementation experiments (P < 0.05 for Met and
Trp, P < 0.01 for all other substitutions). Only Tyr
substituted for Phe at position 667 (Figure 2C), and six
silent mutations were scored for this codon. An
additional mutant obtained with the totally randomized
library but not shown in Figure 2 is mutant 601, with
double substitutions Ile665Asn and Val669Ile.

These results show that generating a random
library and screening by genetic complementation provided
a number of active Taq DNA polymerase I mutants. These
results also show that amino acid residues Arg659 and
Lys663 were found to be immutable and Phe667 and Tyr671
were found to tolerate only conservative substitutions.



EXAMPLE III

Determination of the Fidelity of Active Taq DNA
Polymerase I Mutants

This example describes methods of determining
the fidelity of active Taq DNA polymerase I mutants. Two
types of assays are useful for determining the fidelity
of active polymerase mutants: a primer extension assay
and a forward mutation assay.

Crude extracts were used to determine the
fidelity of polymerase mutants. A single colony of
E. coli DH5α (F⁻, φ80dlacZΔM15, Δ(lacZYA-argF)U169, deoR,
recA1, endA1, phoA, hsdR17(rK⁻, mK⁺), supE44, λ⁻, thi-1,
gyrA96, relA1) carrying wild type or mutant Taq DNA
polymerase I was inoculated into 40 ml of 2xYT
(16 g/liter tryptone, 10 g/liter yeast extract, 5 g/liter
NaCl, pH 7.3) containing 30 mg/liter chloramphenicol.
After incubation at 37°C overnight with vigorous shaking,
an equal amount of fresh medium with 0.5 mM IPTG was
added, and incubation was continued for 4 h. Cells were
harvested, washed once with TE buffer (10 mM Tris-HCl,
pH 8.0, 1 mM EDTA) and suspended in 100 µl of buffer A
(50 mM Tris-HCl, pH 8.0, 2.4 mM phenylmethylsulfonyl
fluoride, 1 mM dithiothreitol, 0.5 mg/liter leupeptin,
1 mM EDTA, 250 mM KCl). Bacteria were lysed by
incubating with lysozyme (0.2 mg/ml) at 0°C for 2 h. The
lysate was centrifuged at 15,000 rpm (Sorvall, SA-600
rotor) (DuPont, Newtown, CT) for 15 min, and the
supernatant solution was incubated at 72°C for 20 min.
Insoluble material was removed by centrifugation.

Polymerases were purified as described
previously with some modifications (Lawyer et al., PCR
Methods Application 2:275-287 (1993)). Briefly, a single
colony of E. coli DH5α carrying wild type or mutant Taq
DNA polymerase I was inoculated into 10 ml of 2xYT. Two
ml of the inoculum was immediately added to each of 5
bottles containing 1 liter of 2xYT with 30 mg/liter
chloramphenicol. After overnight incubation at 37°C with
vigorous shaking, 1 liter of 2xYT containing 30 mg/liter
chloramphenicol and 0.5 mM IPTG was added, and incubation
was continued for 4 h. Cells were harvested, washed once
with TE buffer and suspended in 100 ml buffer A.
Bacteria were lysed by incubating with lysozyme
(0.2 mg/ml) at 0°C for 2 h and then sonicating on ice for
45 sec by using a micro-tip probe (Sonifier, Branson
Sonic Power, Danbury, CT).



The lysate was centrifuged at 15,000 rpm
(Sorvall, SA-600 rotor) for 15 min, and the supernatant
solution was incubated at 72°C for 20 min. Insoluble
material was removed by centrifugation. Ammonium sulfate
(0.2 M) and Polymin P (0.6%) were added and the
suspension was held on ice for 1 h. After removal of the
precipitate by centrifugation and filtration through a
Costar 8310 filter, the filtrate was applied to a
3 x 8-cm phenyl-SEPHAROSE HP (Pharmacia Biotech) column
equilibrated with buffer A containing 0.2 M ammonium
sulfate and 0.01% Triton X-100. The column was washed
with the same buffer (300 ml) and activity was eluted
with buffer B (TE buffer containing 0.01% Triton X-100
and 50 mM KCl). The eluate (100 ml) was dialyzed
overnight against 4 liters of buffer B and loaded onto a
0.8 x 8-cm heparin-SEPHAROSE CL6B (Pharmacia Biotech)
column equilibrated with buffer B. After washing with
buffer B (50 ml), activity was eluted in a 30 ml linear
gradient of 50-500 mM KCl in TE buffer containing 0.01%
Triton X-100. Active fractions were collected, dialyzed
against 50 mM Tris-HCl (pH 8.0) containing 50 mM KCl and
50% glycerol, and stored at -80°C.

To confirm and quantitate the presence of
polymerase activity, crude extracts or purified enzyme
were incubated at 72°C for 5 min in 50 mM Tris-HCl
(pH 8.0), 2 mM MgCl₂, 100 µM each dATP, dGTP, dCTP and
dTTP, 0.2 µCi of [³H]dATP and 200 µg/ml activated calf
thymus DNA. Incorporation of radioactivity into an acid-
insoluble product was measured according to Battula and
Loeb (J. Biol. Chem. 249:4086-4093 (1974)). One unit
represents incorporation of 10 nmol of dNMP in 1 h,
corresponding to 0.1 unit as defined by Perkin-Elmer.
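
The unit definition above can be applied to an assay of
any duration if incorporation is taken to be linear with
time. The sketch below makes that linearity assumption,
which is not stated in the text, and the example input of
2.5 nmol is hypothetical. Function names are illustrative.

```python
NMOL_PER_UNIT_PER_HOUR = 10.0   # one unit = 10 nmol dNMP in 1 h (from the text)
PERKIN_ELMER_PER_UNIT = 0.1     # one unit here = 0.1 Perkin-Elmer unit


def units_from_assay(nmol_incorporated, minutes):
    """Units of polymerase, assuming incorporation is linear over the assay."""
    hours = minutes / 60.0
    return nmol_incorporated / (NMOL_PER_UNIT_PER_HOUR * hours)


def to_perkin_elmer_units(units):
    """Convert units as defined in the text to Perkin-Elmer units."""
    return units * PERKIN_ELMER_PER_UNIT


if __name__ == "__main__":
    # Hypothetical reading: 2.5 nmol dNMP incorporated in the 5 min assay.
    units = units_from_assay(2.5, 5)
    print(f"{units:.2f} units = {to_perkin_elmer_units(units):.2f} Perkin-Elmer units")
```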

For the primer extension assay, the 14-mer
primer 5'-CGCGCCGAATTCCC (SEQ ID NO:10) was ³²P-labeled at
the 5' end by incubation with [γ-³²P]ATP and T4
polynucleotide kinase and annealed to an equimolar amount
of the template 46-mer
5'-GCGCGGAAGCTTGGCTGCAGAATATTGCTAGCGGGAATTCGGCGCG
(SEQ ID NO:11). Heat-inactivated E. coli extracts
containing 0.3-1 unit of wild type or mutant Taq DNA
polymerases were incubated at 45°C for 60 min in 50 mM
Tris-HCl (pH 8.0), 2 mM MgCl₂, 50 mM KCl, 20 µM each dATP,
dGTP, dCTP and dTTP and 1.4 ng of the annealed template
primer. A set of four additional reactions, each lacking
a different dNTP, was carried out for each polymerase.
Purified enzyme (1 unit) was incubated for the times
indicated under the same conditions as for crude
extracts. After electrophoresis in a 14% polyacrylamide
gel containing 8 M urea, reaction products were analyzed
by autoradiography. Extension was quantified by using an
NIH imaging program (see http://www.nih.gov/).
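
The primer and template sequences given above can be
examined computationally. The following sketch checks that
the 14-mer is complementary to the 3' end of the 46-mer and
counts how many correct nucleotides can be added before the
polymerase first reaches a template position that requires
the missing dNTP. It models base pairing only and says
nothing about enzyme kinetics; the function names are
illustrative.

```python
PRIMER = "CGCGCCGAATTCCC"  # 14-mer, SEQ ID NO:10
TEMPLATE = "GCGCGGAAGCTTGGCTGCAGAATATTGCTAGCGGGAATTCGGCGCG"  # 46-mer, SEQ ID NO:11

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def templated_extension(primer, template):
    """Nucleotides (5'->3') added to the primer when the template is fully copied."""
    full_product = reverse_complement(template)
    if not full_product.startswith(primer):
        raise ValueError("primer is not complementary to the 3' end of the template")
    return full_product[len(primer):]


def correct_additions_before_stall(primer, template, missing_base):
    """Correct incorporations possible before the first site needing the missing dNTP."""
    extension = templated_extension(primer, template)
    first_block = extension.find(missing_base)
    return len(extension) if first_block == -1 else first_block


if __name__ == "__main__":
    for base in "GCTA":
        n = correct_additions_before_stall(PRIMER, TEMPLATE, base)
        print(f"minus d{base}TP: {n} correct additions before the first blocked site")
```

With these sequences, the first blocked site is reached
after 0, 1, 2 and 3 correct additions in the reactions
lacking dGTP, dCTP, dTTP and dATP, respectively, so any
longer product in a minus reaction requires at least one
misincorporation.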



For the forward mutation assay, the non-coding
strand of the lacZα gene contained in 200 ng of gapped
M13mp2 DNA was copied by using 5 units of wild type or
mutant Taq DNA polymerase I in a reaction mixture
containing 50 mM Tris-HCl (pH 8.0), 2 mM MgCl₂ and 50 mM
KCl (Feig et al., Proc. Natl. Acad. Sci. USA 91:6609-6613
(1994)). For determining low fidelity polymerase
mutants, the reaction included 20 µM each dNTP. For
determining high fidelity polymerase mutants, the
reaction was carried out with biased dNTP pools
containing 0.5 mM of one dNTP and 20 mM of each of the
other three dNTPs. For example, the reaction could
contain 0.5 mM dATP and 20 mM each of dGTP, dCTP and
dTTP. After incubation at 72°C for 5 min, the DNA was
transfected into host E. coli and the plaques were scored
for white and pale blue mutant plaques (Tindall et al.,
Genetics 118:551-560 (1988)).



These results show that the fidelity of active
Taq DNA polymerase mutants can be determined using a
primer extension assay and a forward mutation assay.

EXAMPLE IV

Identification of Low Fidelity Taq DNA Polymerase I
Mutants

This example shows the identification of low
fidelity Taq DNA polymerase I mutants.

The active Taq DNA polymerase I mutants
identified in Example II were assayed by the methods
described in Example III to identify low fidelity
mutants. Screening for activity was carried out on 67 of
75 sequenced mutants, including all 38 with single amino
acid substitutions described in Figure 2. Plasmids
encoding the mutant polymerases were cloned, purified and
grown in E. coli, and host cells were analyzed for
expression of Taq DNA polymerase I by measuring the
activity of crude extracts. E. coli DNA polymerases and
nucleases were inactivated by heating at 72°C for 20 min.
The ability of heat-treated extracts to elongate primers
in the absence of a complete complement of four dNTPs was
then determined using a set of five reactions. One
reaction contained all four complementary nucleoside
triphosphates while each of the others lacked a different
dNTP ("minus conditions"). Elongation in the minus
reactions is limited by the rate of misincorporation at
template positions complementary to the missing dNTP.
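
The set of five reactions described above can be
enumerated programmatically, which is convenient when
laying out a screening plate. This is only an illustrative
sketch; the labels and function name are not taken from
the text.

```python
DNTPS = ("dATP", "dGTP", "dCTP", "dTTP")


def minus_reaction_panel(dntps=DNTPS):
    """Return the complete reaction plus one reaction lacking each dNTP."""
    panel = {"complete": tuple(dntps)}
    for missing in dntps:
        panel[f"minus {missing}"] = tuple(d for d in dntps if d != missing)
    return panel


if __name__ == "__main__":
    for label, included in minus_reaction_panel().items():
        print(f"{label}: {', '.join(included)}")
```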

A primer extension assay was performed on wild
type Taq DNA polymerase I and several mutants, revealing
that several mutants had elongation patterns that
differed from wild type Taq DNA polymerase. In the
presence of all four dNTPs, every extract examined
extended more than 90% of the hybridized primer to a
product of length similar to that of the template. In
the minus reactions, wild type Taq DNA polymerase I
extended 48-60% of the primer up to, but not opposite,
the first template position complementary to the missing
dNTP. The remaining primer was terminated opposite the
missing dNTP, presumably by incorporation of a single
non-complementary nucleotide, or was terminated further
downstream, presumably by extension of the mispaired
primer terminus. A variety of elongation patterns was
observed for the 67 mutants. Thirteen mutants extended
more of the primer and/or synthesized a greater
proportion of longer products than the wild type enzyme
in three or four of the minus reactions. For example,
mutant 2 formed full-length products in reactions lacking
dGTP or dTTP. This increased extension presumably
reflects increased incorporation and/or extension of
non-complementary nucleotides. Other mutants extended
less of the primer or synthesized shorter products than
the wild type enzyme, for example, mutant 5. In several
cases, different amino acid substitutions at the same
position either increased or decreased extension in
comparable minus reactions.

A compilation of amino acid replacements in the
13 mutants that displayed increased extension in at least
three of the minus reactions is shown in Table I. With
the exception of Gly668, one or more substitutions that
putatively reduce the accuracy of DNA synthesis were
observed for each of the 9 non-conserved amino acids.
Eleven mutants harbored substitutions at either Ala661 or
Thr664, including several single mutants. This initial
screen with crude extracts suggested that a large number
of changes are permitted in the O-helix that do not
reduce the ability of Taq DNA polymerase I to complement
the growth defect of recA718 polA12. Many of the

Table I. Low Fidelity Mutants of Taq DNA Polymerase I
Identified in the Primer Extension Screen

complementation reduce the fidelity of DNA synthesis in
vitro.



To demonstrate that the reduction in fidelity
exhibited by crude extracts is due to mutant Taq DNA
polymerase I, wild type enzyme was purified as well as
the three single mutants Ala661Glu, Ala661Pro and
Thr664Arg. The mutant Ile665Thr, a mutant predicted to
have no alteration in fidelity based on complementation
assays, was also purified as a control. The mutated
enzymes retained at least 29% of wild type activity in
vitro, which is in accord with their ability to
complement the growth defect caused in E. coli by
temperature-sensitive host DNA polymerase I and ensures
that analysis of fidelity will not be complicated by
major impairments of catalytic efficiency.

Primer extension assays were carried out with
the homogeneous mutant polymerases. Wild type Taq DNA
polymerase I extended most of the primer to one
nucleotide before the template position opposite the
missing complementary dNTP in a 5 min reaction. Only
about 30% of the primers were elongated further. In
reactions containing equivalent activity, the mutant
polymerases Ala661Glu, Thr664Arg and Ala661Pro extended a
larger proportion of the primers past the sites where the
wild type polymerase ceased synthesis. The control
enzyme Ile665Thr yielded an elongation pattern similar to
that of the wild type enzyme. Elongation reactions with
the three polymerases were also carried out for 60 min.
Again, Ala661Glu and Thr664Arg synthesized a greater
proportion of longer products than obtained with the wild
type and Ile665Thr polymerases. Notably, Ala661Glu,
Thr664Arg and Ala661Pro synthesized longer products in
5 min than the wild type did in 60 min.

To further analyze the reduced fidelity
exhibited by the low fidelity polymerase mutants, a time
course of primer elongation was carried out. Wild type
Taq DNA polymerase I extended 9% of the primers past the
first deoxyguanosine template residue within the 60 min
incubation period, but elongation past the second
deoxyguanosine was not detected. In the same interval,
Thr664Arg extended 93% of the primer past the first
template deoxyguanosine, and elongation proceeded past as
many as five template deoxyguanosines. Importantly, a
comparable proportion of primers was extended at all time
points, despite the striking difference in the length of
the products. These time course data indicate that
greater elongation reflects increased ability to utilize
non-complementary substrates and primer termini, rather
than a putative difference in the amount of activity
present.

In a forward mutation assay, the fidelity of
DNA synthesis by the purified polymerases was quantitated
by measuring the frequency of mutations produced by
copying a biologically active template in vitro (Kunkel
and Loeb, J. Biol. Chem. 254:5718-5725 (1979)). The
target sequence was the lacZα gene located within a
single-stranded region in gapped circular double-stranded
M13mp2 DNA (Feig and Loeb, Biochemistry 32:4466-4473
(1993)). The gapped segment was filled by synthesis with
the wild type or mutant enzymes. The double-stranded
circular product was transfected into E. coli, and the
mutation frequency was determined by scoring white and
pale blue mutant plaques. A comparison of the specific
activities and mutation frequencies of the purified
enzymes is presented in Table II. After synthesis by
wild type Taq DNA polymerase I, the mutation frequency
was not greater than that of the uncopied control.
Synthesis by Ala661Glu and Thr664Arg gave rise to
mutation frequencies more than 7- and 25-fold greater,
respectively, than that of the wild type polymerase.

A sample of independent, randomly chosen
mutants produced by Thr664Arg was characterized by DNA
sequence analysis using a THERMO SEQUENASE cycle
sequencing kit (Amersham Life Science, Cleveland, OH).
Both base substitutions and frameshifts were found
throughout the targeted lacZα gene and its regulatory
sequence. Of the 64 independent plaques, 57 had
mutations in the target. Other mutations presumably
occurred outside the target region. Some had more than
one base substitution and a total of 66 mutations were
observed (see Figure 3). Among them, 61 were base
substitutions. Transitions (38/61) were more frequent
than transversions (23/61). T → C transitions accounted
for 31 of 61 base substitutions, while T → A (9/61),
A → T (8/61) and G → A (5/61) substitutions were less
frequent. This base substitution spectrum is essentially
the same as that reported for wild type Taq DNA
polymerase I (Tindall and Kunkel, supra, 1988). From
these data, the base-substitution fidelity of Thr664Arg
can be calculated as 8.6 x 10⁻⁴ or 1 error per 1200
nucleotides. On the basis of the five frameshift mutants
detected, the frameshift error can be calculated as 4.9 x
10⁻⁵ or 1 error per 20,000 nucleotides.

Table II. Mutation Frequency in the lacZα Forward
Mutation Assay

Taq Pol I    Specific Activity    Plaques Scored      Mutation
             (units/mg)           Total    Mutant     Frequency
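
The "1 error per N nucleotides" figures above are the
reciprocals of the reported error rates, rounded. A
minimal sketch of that conversion, together with the
transition and transversion fractions, is given here. It
only restates numbers from the text; the names are
illustrative.

```python
BASE_SUBSTITUTION_RATE = 8.6e-4  # errors per nucleotide, Thr664Arg
FRAMESHIFT_RATE = 4.9e-5         # errors per nucleotide, Thr664Arg
TRANSITIONS = 38
TRANSVERSIONS = 23


def nucleotides_per_error(error_rate):
    """Average number of nucleotides polymerized per error."""
    return 1.0 / error_rate


def fraction(count, total):
    """Share of a mutation class among all base substitutions."""
    return count / total


if __name__ == "__main__":
    total = TRANSITIONS + TRANSVERSIONS
    print(f"base substitutions: 1 per {nucleotides_per_error(BASE_SUBSTITUTION_RATE):.0f} nt")
    print(f"frameshifts:        1 per {nucleotides_per_error(FRAMESHIFT_RATE):.0f} nt")
    print(f"transitions:   {fraction(TRANSITIONS, total):.0%}")
    print(f"transversions: {fraction(TRANSVERSIONS, total):.0%}")
```

The exact reciprocals are somewhat below 1200 and slightly
above 20,000 nucleotides, consistent with the rounded
values quoted in the text.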

These results show that low fidelity Taq DNA
polymerase I mutants were identified from a randomized
library using a genetic complementation screen. The
fidelity of Taq DNA polymerase I mutants was determined
by primer extension assays and forward mutation assays.

EXAMPLE V

Identification of High Fidelity Taq DNA Polymerase I
Mutants

This example shows the identification of high
fidelity Taq DNA polymerase I mutants.

The active Taq DNA polymerase I mutants
identified in Example II were assayed by the methods
described in Example III to identify high fidelity
mutants. A panel of 75 active polymerases was screened.
Candidate high fidelity polymerase mutants are shown in
Table III.

Table III. Candidate High Fidelity Mutants of
Taq DNA Polymerase I

        659     663     667     671
WT:     R R A A K T I N F G V L Y

Mutant  Substituted residues
FL:     L
74:     E, T, L
146:    D
147:    I
149:    I, D
169:    S, L
186:    L
219:    P, V, Y
254:    V
407:    Y
424:    Y
426:    S
487:    R
488:    K
530:    S
614:    Q

Thirteen of the active polymerases exhibited greater
accuracy in DNA synthesis. Table IV summarizes the
results of a forward mutation assay of some of these high
fidelity mutants. Several polymerase mutants displayed
higher fidelity than the wild type Taq DNA polymerase.
Polymerase mutants exhibiting particularly high fidelity
are mutant 424, with Phe667Tyr, mutant 426, with
Arg660Ser and mutant 488, with Arg660Lys.

Table IV. Fidelity of Taq DNA Polymerase Mutants in a
lacZ Forward Mutation Assay

Enzyme                   Total      Mutant     Mutation
                         Plaques    Plaques    Frequency
--------------------------------------------------------
Wild Type                5680       49         8.6

High Fidelity Mutants
MS147                    7249       47         6.5
MS169                    7275       34         5.1
MS254                    6898       40         5.8
MS424                    4810       14         2.7
MS426                    5727       23         4.1
MS488                    3442       13         1.5

Low Fidelity Mutant
MS206                    3333       133        40

These results show that Taq DNA polymerase
mutants were identified and found to exhibit higher
fidelity than wild type Taq DNA polymerase.
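
The mutation frequencies in Table IV can be compared
directly with the wild type value. The sketch below
assumes that the frequency column is expressed as mutant
plaques per 1000 plaques scored; this unit is inferred
here from the wild type row (49 mutant plaques among 5680)
and from the MS206 row, and is not stated in the table.
Fold changes are taken from the tabulated frequencies
rather than recomputed from plaque counts.

```python
# Tabulated mutation frequencies from Table IV (assumed unit: per 1000 plaques).
TABULATED_FREQUENCY = {
    "Wild Type": 8.6,
    "MS424": 2.7,
    "MS426": 4.1,
    "MS488": 1.5,
    "MS206": 40.0,
}


def frequency_per_thousand(mutant_plaques, total_plaques):
    """Mutant plaques per 1000 plaques scored."""
    return 1000.0 * mutant_plaques / total_plaques


def fold_change_vs_wild_type(enzyme, table=TABULATED_FREQUENCY):
    """Tabulated frequency of an enzyme relative to wild type (<1 means higher fidelity)."""
    return table[enzyme] / table["Wild Type"]


if __name__ == "__main__":
    print(f"wild type from plaque counts: {frequency_per_thousand(49, 5680):.1f}")
    for name in ("MS424", "MS426", "MS488", "MS206"):
        print(f"{name}: {fold_change_vs_wild_type(name):.2f} x wild type")
```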

EXAMPLE VI

High Fidelity Taq DNA Polymerase Mutants Enhance the
Sensitivity of Mismatch PCR-based Assays for Somatic
Mutations

This example shows the use of high fidelity
mutants obtained by mutating the active site O-helix of
Taq DNA polymerase I to enhance the sensitivity of
mismatch PCR-based assays for somatic mutations.

Mismatch PCR is the basis of allele-specific
identification of inherited mutations within genes and
somatic mutations that occur in tumors. In these
studies, one compares the extension of a correctly
matched primer with the lack of extension using a primer
with a 3'-terminal mismatch. The rate of extension by
DNA polymerase using a primer with a single mismatch
compared to a primer with a 3'-complementary base pair
(matched) terminus is approximately 10⁻⁵ (Perinno and
Loeb, J. Biol. Chem. 262:2898-2905 (1989)). Elongation
from a double mismatch is even less frequent, and thus
offers an even more stringent test of the inability of
mutant Taq DNA polymerases to elongate a mismatched
primer terminus.

A template containing the wild type sequence of
human DNA polymerase-β at nucleotide positions 886-889
(CCCCTGGG) was utilized. PCR reactions were carried out
with two complementary primers that flank the sequence
(matched) or with one matched template and a second
mismatched template containing a terminally mismatched
primer with AA at the 3' terminal position. The AA would
be across from the CC (underlined) in the template
strand. In these studies, the ratio of templates
containing the complementary and non-complementary
sequences was varied. The PCR amplified product was
separated by polyacrylamide gel electrophoresis and
quantitated by phosphoimage analysis. Wild type Taq DNA
polymerase detected one molecule of template containing a
TT substitution in place of the two template CC when
present in a population of 10⁵ molecules containing the
non-mutant templates with the CC substitution. In
contrast, both of the high fidelity Taq DNA polymerase
mutants, with substitutions Phe667Tyr and Arg659Ser,
detected one molecule of the TT template amongst 10⁸
molecules of the CC template when the primer contained
two terminal 3'-AA nucleotide residues.

These results show that high fidelity Taq DNA
polymerase mutants have two to three orders of magnitude
enhanced sensitivity for detecting mutant DNA using a
mismatch PCR-based assay.

EXAMPLE VII

High Fidelity Taq DNA Polymerase Mutants Enhance
Sensitivity of Detection of Repetitive DNA Sequences

This example demonstrates the use of high
fidelity polymerase mutants to enhance the sensitivity
and accuracy of amplifying repetitive DNA sequences.

Detection of the length of unstable
microsatellite DNA in certain human tumors has depended
on PCR amplification of specific sequences and
determination of changes in electrophoretic mobility in
gels. Due to the slippage of DNA polymerase while
copying repetitive DNA, the interpretation of the results
of this method has remained unsatisfactory.

High fidelity Taq DNA polymerases are
identified using the methods described in Examples I and
III. DNA templates containing runs of CA repeats with
the number of repeats varying from 5 to 50 are used to
test high fidelity Taq DNA polymerase mutants. After 20
to 70 rounds of PCR amplification, the product of the
reaction is displayed on polyacrylamide gels. High
fidelity polymerase mutants that display fewer slippage
errors when copying the repetitive sequences are
identified. These high fidelity polymerase mutants are
used to amplify repetitive DNA sequences in samples, for
example tissue or tumor samples.

These results show that high fidelity mutants 
having enhanced sensitivity and accuracy in amplifying 
repetitive DNA sequences can be identified and used to 
amplify repetitive DNA in tissue or tumor samples. 

Throughout this application various 
publications have been referenced. The disclosures of 
these publications in their entireties are hereby 
incorporated by reference in this application in order to 
more fully describe the state of the art to which this 
invention pertains. 

Although the invention has been described with
reference to the disclosed embodiments, those skilled in
the art will readily appreciate that the specific
experiments detailed are only illustrative of the
invention. It should be understood that various
modifications can be made without departing from the
spirit of the invention.

1. A method for identifying a thermostable
polymerase having altered fidelity, comprising generating
a random population of polymerase mutants by mutating one
or more amino acid residues adjacent to an immutable or
nearly immutable residue in an active site O-helix of a
thermostable polymerase and screening said population for
one or more active polymerase mutants.

2. The method of claim 1, further comprising 
determining a fidelity of said active polymerase mutant. 

3. The method of claim 1, wherein said one or 
more amino acid residues is immediately adjacent to an 
immutable or nearly immutable residue. 

4. The method of claim 1, wherein said one or 
more amino acid residues is adjacent to an amino acid 
residue corresponding to Arg659, Lys663, Phe667 or Tyr671 
in Taq DNA polymerase. 

5. The method of claim 4, wherein said 
thermostable polymerase is Taq DNA polymerase. 

6. An isolated thermostable polymerase mutant 
having altered fidelity, wherein said polymerase mutant 
comprises one or more mutated amino acid residues 
adjacent to an immutable or nearly immutable residue in 
the active site O-helix of a thermostable polymerase. 

7. The polymerase mutant of claim 6, wherein 
said polymerase is Taq DNA polymerase. 

8. The polymerase mutant of claim 6, wherein 
said one or more amino acid residues is immediately 
adjacent to an immutable or nearly immutable residue. 

9. The polymerase mutant of claim 6, wherein 
said mutated amino acid residue is adjacent to an amino 
acid residue corresponding to Arg659, Lys663, Phe667 or 
Tyr671 in Taq DNA polymerase. 

10. The polymerase mutant of claim 9, wherein 
said polymerase is Taq DNA polymerase. 

11. The polymerase mutant of claim 7, wherein 
said mutant is a high fidelity mutant. 

12. The polymerase mutant of claim 11, wherein 
said polymerase mutant comprises one or more amino acid 
substitutions selected from the group consisting of 
Arg660Tyr; Arg660Ser; Gly668Arg; Arg660Lys; Gly668Ser; 
and Gly668Gln. 

13. The polymerase mutant of claim 7, wherein 
said mutant is a low fidelity mutant. 

14. The polymerase mutant of claim 13, wherein 
said polymerase mutant comprises substitution of one or 
more amino acids selected from the group consisting of 
Ala661, Thr664, Asn666 and Leu670. 

15. An isolated nucleic acid molecule encoding 
a polymerase mutant having high fidelity, comprising a 
nucleotide sequence encoding substantially an amino acid 
sequence of Taq DNA polymerase I comprising one or more 
amino acid substitutions selected from the group 
consisting of Arg660Tyr; Arg660Ser; Gly668Arg; Arg660Lys; 
Gly668Ser; and Gly668Gln. 

16. An isolated nucleic acid molecule encoding 
a polymerase mutant having low fidelity, comprising a 
nucleotide sequence encoding substantially an amino acid 
sequence of Taq DNA polymerase I comprising substitution 
of one or more amino acids selected from the group 
consisting of Ala661, Thr664, Asn666 and Leu670. 

17. A method for identifying one or more 
mutations in a gene, comprising amplifying said gene 
using a high fidelity polymerase mutant under conditions 
which allow polymerase chain reaction amplification. 

18. A method for identifying one or more 
mutations in a gene, comprising amplifying said gene 
using the high fidelity polymerase mutant of claim 11 
under conditions which allow polymerase chain reaction 
amplification. 

19. The method of claim 17, wherein said gene 
is amplified by exposing the strands of said gene to 
repeated cycles of denaturing, annealing and elongation 
to produce an amplified product. 

20. The method of claim 19, further comprising 
determining the presence or absence of one or more 
mutations in the sequence of said gene. 

21. The method of claim 17, wherein said 
polymerase mutant comprises one or more amino acid 
substitutions selected from the group consisting of 
Arg660Tyr; Arg660Ser; Gly668Arg; Arg660Lys; Gly668Ser; 
and Gly668Gln. 

22. A method for accurately copying repetitive 
nucleotide sequences, comprising amplifying said 
repetitive nucleotide sequence using a high fidelity 
polymerase mutant. 

23. The method of claim 22, wherein said 
repetitive nucleotide sequence is in a gene. 

24. The method of claim 22, wherein said 
repetitive nucleotide sequence is in a microsatellite 
between genes. 

25. A method for accurately copying repetitive 
nucleotide sequences, comprising amplifying said 
repetitive nucleotide sequence using said high fidelity 
polymerase mutant of claim 11. 

26. A method for determining an inherited 
mutation, comprising amplifying a gene using a high 
fidelity polymerase mutant. 

27. A method for diagnosing a genetic disease, 
comprising correlating the inherited mutation determined 
in claim 26 with said genetic disease. 

28. A method for diagnosing a genetic disease, 
comprising amplifying a gene using a high fidelity 
polymerase mutant. 

29. A method for diagnosing a genetic disease, 
comprising amplifying a gene using said high fidelity 
polymerase mutant of claim 11. 

30. The method of claim 28, wherein said 
genetic disease comprises mutations in microsatellite or 
repetitive DNA. 

31. The method of claim 30, wherein said 
genetic disease is cancer. 

32. A method for determining the prognosis of 
a genetic disease, comprising amplifying said gene in 
claim 28. 

33. The method of claim 28, wherein said 
polymerase mutant comprises one or more amino acid 
substitutions selected from the group consisting of 
Arg660Tyr; Arg660Ser; Gly668Arg; Arg660Lys; Gly668Ser; 
and Gly668Gln. 

34. A method for randomly mutagenizing a gene, 
comprising amplifying said gene using a low fidelity 
polymerase mutant. 

35. A method for randomly mutagenizing a gene, 
comprising amplifying said gene using said low fidelity 
polymerase mutant of claim 13. 

36. The method of claim 35, wherein said 
polymerase mutant comprises substitution of one or more 
amino acid residues selected from the group consisting of 
Ala661, Thr664, Asn666 and Leu670. 

AAGCTCAGAT CTACCTGCCT GAGGGCGTCC GGTTCCAGCT GGCCCTTCCC GAGGGGGAGA 60 

GGGAGGCGTT TCTAAAAGCC CTTCAGGACG CTACCCGGGG GCGGGTGGTG GAAGGGTAAC 120 

[Sequence rows for nucleotides 121 to 1416 (amino acid residues 1 to 432) are illegible in the source.] 

FIG. 1B 

GTG GAG AGG CCC CTT TCC GCT GTC CTG GCC CAC ATG GAG GCC ACG GGG 1464 
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly 
435 440 445 

GTG CGC CTG GAC GTG GCC TAT CTC AGG GCC TTG TCC CTG GAG GTG GCC 1512 
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala 
450 455 460 

GAG GAG ATC GCC CGC CTC GAG GCC GAG GTC TTC CGC CTG GCC GGC CAC 1560 
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His 
465 470 475 480 

CCC TTC AAC CTC AAC TCC CGG GAC CAG CTG GAA AGG GTC CTC TTT GAC 1608 
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp 
485 490 495 

GAG CTA GGG CTT CCC GCC ATC GGC AAG ACG GAG AAG ACC GGC AAG CGC 1656 
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg 
500 505 510 

TCC ACC AGC GCC GCC GTC CTG GAG GCC CTC CGC GAG GCC CAC CCC ATC 1704 
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile 
515 520 525 

GTG GAG AAG ATC CTG CAG TAC CGG GAG CTC ACC AAG CTG AAG AGC ACC 1752 
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr 
530 535 540 

TAC ATT GAC CCC TTG CCG GAC CTC ATC CAC CCC AGG ACG GGC CGC CTC 1800 
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu 
545 550 555 560 

CAC ACC CGC TTC AAC CAG ACG GCC ACG GCC ACG GGC AGG CTA AGT AGC 1848 
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser 
565 570 575 

TCC GAT CCC AAC CTC CAG AAC ATC CCC GTC CGC ACC CCG CTT GGG CAG 1896 
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln 
580 585 590 

AGG ATC CGC CGG GCC TTC ATC GCC GAG GAG GGG TGG CTA TTG GTG GCC 1944 
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala 
595 600 605 

CTG GAC TAT AGC CAG ATA GAG CTC AGG GTG CTG GCC CAC CTC TCC GGC 1992 
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly 
610 615 620 

GAC GAG AAC CTG ATC CGG GTC TTC CAG GAG GGG CGG GAC ATC CAC ACG 2040 
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr 
625 630 635 640 

GAG ACC GCC AGC TGG ATG TTC GGC GTC CCC CGG GAG GCC GTG GAC CCC 2088 
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro 
645 650 655 

CTG ATG CGC CGG GCG GCC AAG ACC ATC AAC TTC GGG GTC CTC TAC GGC 2136 
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly 
660 665 670 

ATG TCG GCC CAC CGC CTC TCC CAG GAG CTA GCC ATC CCT TAC GAG GAG 2184 
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu 
675 680 685 

GCC CAG GCC TTC ATT GAG CGC TAC TTT CAG AGC TTC CCC AAG GTG CGG 2232 
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg 
690 695 700 

GCC TGG ATT GAG AAG ACC CTG GAG GAG GGC AGG AGG CGG GGG TAC GTG 2280 
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val 
705 710 715 720 

GAG ACC CTC TTC GGC CGC CGC CGC TAC GTG CCA GAC CTA GAG GCC CGG 2328 
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg 
725 730 735 

GTG AAG AGC GTG CGG GAG GCG GCC GAG CGC ATG GCC TTC AAC ATG CCC 2376 
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro 
740 745 750 

GTC CAG GGC ACC GCC GCC GAC CTC ATG AAG CTG GCT ATG GTG AAG CTC 2424 
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu 
755 760 765 

TTC CCC AGG CTG GAG GAA ATG GGG GCC AGG ATG CTC CTT CAG GTC CAC 2472 
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His 
770 775 780 

GAC GAG CTG GTC CTC GAG GCC CCA AAA GAG AGG GCG GAG GCC GTG GCC 2520 
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala 
785 790 795 800 

CGG CTG GCC AAG GAG GTC ATG GAG GGG GTG TAT CCC CTG GCC GTG CCC 2568 
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro 
805 810 815 

CTG GAG GTG GAG GTG GGG ATA GGG GAG GAC TGG CTC TCC GCC AAG GAG 2616 
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu 
820 825 830 

TGATACCACC 2626 
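
The residue numbering used in the claims (Arg659, Lys663, Phe667 and Tyr671) can be checked against the sequence row ending at nucleotide 2136. The sketch below is illustrative only: it translates that row with the standard genetic code and numbers the residues starting at 657, a starting position that is an assumption inferred from the 660, 665 and 670 labels under the row. The names `CODON_TABLE`, `O_HELIX_ROW`, `FIRST_RESIDUE` and `number_residues` are hypothetical.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Codon row ending at nucleotide 2136, copied from the sequence above.
O_HELIX_ROW = "CTG ATG CGC CGG GCG GCC AAG ACC ATC AAC TTC GGG GTC CTC TAC GGC"
FIRST_RESIDUE = 657  # assumption: inferred from the 660/665/670 labels


def number_residues(codon_row, first_residue):
    """Map residue position to one-letter amino acid for a row of codons."""
    codons = codon_row.split()
    return {first_residue + n: CODON_TABLE[c] for n, c in enumerate(codons)}


residues = number_residues(O_HELIX_ROW, FIRST_RESIDUE)
for position in (659, 663, 667, 671):
    print(position, residues[position])
```

With this numbering, positions 659, 663, 667 and 671 translate to R, K, F and Y, matching the residues named in claims 4 and 9, and positions 660 and 668 translate to R and G, matching the substitution sites listed in claim 12.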